summaryrefslogtreecommitdiffstats
path: root/drivers/iommu
Commit message (Collapse)AuthorAgeFilesLines
* iommu/vt-d: Don't queue_iova() if there is no flush queueDmitry Safonov2019-08-042-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit effa467870c7612012885df4e246bdb8ffd8e44c upstream. Intel VT-d driver was reworked to use common deferred flushing implementation. Previously there was one global per-cpu flush queue, afterwards - one per domain. Before deferring a flush, the queue should be allocated and initialized. Currently only domains with IOMMU_DOMAIN_DMA type initialize their flush queue. It's probably worth to init it for static or unmanaged domains too, but it may be arguable - I'm leaving it to iommu folks. Prevent queuing an iova flush if the domain doesn't have a queue. The defensive check seems to be worth to keep even if queue would be initialized for all kinds of domains. And is easy backportable. On 4.19.43 stable kernel it has a user-visible effect: previously for devices in si domain there were crashes, on sata devices: BUG: spinlock bad magic on CPU#6, swapper/0/1 lock: 0xffff88844f582008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #1 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x45/0x115 intel_unmap+0x107/0x113 intel_unmap_sg+0x6b/0x76 __ata_qc_complete+0x7f/0x103 ata_qc_complete+0x9b/0x26a ata_qc_complete_multiple+0xd0/0xe3 ahci_handle_port_interrupt+0x3ee/0x48a ahci_handle_port_intr+0x73/0xa9 ahci_single_level_irq_intr+0x40/0x60 __handle_irq_event_percpu+0x7f/0x19a handle_irq_event_percpu+0x32/0x72 handle_irq_event+0x38/0x56 handle_edge_irq+0x102/0x121 handle_irq+0x147/0x15c do_IRQ+0x66/0xf2 common_interrupt+0xf/0xf RIP: 0010:__do_softirq+0x8c/0x2df The same for usb devices that use ehci-pci: BUG: spinlock bad magic on CPU#0, swapper/0/1 lock: 0xffff88844f402008, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.43 #4 Call Trace: <IRQ> dump_stack+0x61/0x7e spin_bug+0x9d/0xa3 do_raw_spin_lock+0x22/0x8e _raw_spin_lock_irqsave+0x32/0x3a queue_iova+0x77/0x145 intel_unmap+0x107/0x113 intel_unmap_page+0xe/0x10 usb_hcd_unmap_urb_setup_for_dma+0x53/0x9d usb_hcd_unmap_urb_for_dma+0x17/0x100 unmap_urb_for_dma+0x22/0x24 __usb_hcd_giveback_urb+0x51/0xc3 usb_giveback_urb_bh+0x97/0xde tasklet_action_common.isra.4+0x5f/0xa1 tasklet_action+0x2d/0x30 __do_softirq+0x138/0x2df irq_exit+0x7d/0x8b smp_apic_timer_interrupt+0x10f/0x151 apic_timer_interrupt+0xf/0x20 </IRQ> RIP: 0010:_raw_spin_unlock_irqrestore+0x17/0x39 Cc: David Woodhouse <dwmw2@infradead.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: iommu@lists.linux-foundation.org Cc: <stable@vger.kernel.org> # 4.14+ Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing") Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> [v4.14-port notes: o minor conflict with untrusted IOMMU devices check under if-condition] Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu: Fix a leak in iommu_insert_resv_regionEric Auger2019-07-261-3/+5
| | | | | | | | | | | | | | | [ Upstream commit ad0834dedaa15c3a176f783c0373f836e44b4700 ] In case we expand an existing region, we unlink this latter and insert the larger one. In that case we should free the original region after the insertion. Also we can immediately return. Fixes: 6c65fb318e8b ("iommu: iommu_get_group_resv_regions") Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/arm-smmu: Avoid constant zero in TLBI writesRobin Murphy2019-06-191-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4e4abae311e4b44aaf61f18a826fd7136037f199 upstream. Apparently, some Qualcomm arm64 platforms which appear to expose their SMMU global register space are still, in fact, using a hypervisor to mediate it by trapping and emulating register accesses. Sadly, some deployed versions of said trapping code have bugs wherein they go horribly wrong for stores using r31 (i.e. XZR/WZR) as the source register. While this can be mitigated for GCC today by tweaking the constraints for the implementation of writel_relaxed(), to avoid any potential arms race with future compilers more aggressively optimising register allocation, the simple way is to just remove all the problematic constant zeros. For the write-only TLB operations, the actual value is irrelevant anyway and any old nearby variable will provide a suitable GPR to encode. The one point at which we really do need a zero to clear a context bank happens before any of the TLB maintenance where crashes have been reported, so is apparently not a problem... :/ Reported-by: AngeloGioacchino Del Regno <kholk11@gmail.com> Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr> Acked-by: Will Deacon <will.deacon@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/arm-smmu-v3: Don't disable SMMU in kdump kernelWill Deacon2019-06-151-6/+4Star
| | | | | | | | | | | | | | | | | | | | | [ Upstream commit 3f54c447df34ff9efac7809a4a80fd3208efc619 ] Disabling the SMMU when probing from within a kdump kernel so that all incoming transactions are terminated can prevent the core of the crashed kernel from being transferred off the machine if all I/O devices are behind the SMMU. Instead, continue to probe the SMMU after it is disabled so that we can reinitialise it entirely and re-attach the DMA masters as they are reset. Since the kdump kernel may not have drivers for all of the active DMA masters, we suppress fault reporting to avoid spamming the console and swamping the IRQ threads. Reported-by: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com> Tested-by: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/vt-d: Set intel_iommu_gfx_mapped correctlyLu Baolu2019-06-151-3/+4
| | | | | | | | | | | | | | | | | | | | [ Upstream commit cf1ec4539a50bdfe688caad4615ca47646884316 ] The intel_iommu_gfx_mapped flag is exported by the Intel IOMMU driver to indicate whether an IOMMU is used for the graphic device. In a virtualized IOMMU environment (e.g. QEMU), an include-all IOMMU is used for graphic device. This flag is found to be clear even the IOMMU is used. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Reported-by: Zhenyu Wang <zhenyuw@linux.intel.com> Fixes: c0771df8d5297 ("intel-iommu: Export a flag indicating that the IOMMU is used for iGFX.") Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/tegra-smmu: Fix invalid ASID bits on Tegra30/114Dmitry Osipenko2019-05-251-7/+18
| | | | | | | | | | | | | | | | | | commit 43a0541e312f7136e081e6bf58f6c8a2e9672688 upstream. Both Tegra30 and Tegra114 have 4 ASID's and the corresponding bitfield of the TLB_FLUSH register differs from later Tegra generations that have 128 ASID's. In a result the PTE's are now flushed correctly from TLB and this fixes problems with graphics (randomly failing tests) on Tegra30. Cc: stable <stable@vger.kernel.org> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/amd: Set exclusion range correctlyJoerg Roedel2019-05-101-1/+1
| | | | | | | | | | | | | | | | [ Upstream commit 3c677d206210f53a4be972211066c0f1cd47fe12 ] The exlcusion range limit register needs to contain the base-address of the last page that is part of the range, as bits 0-11 of this register are treated as 0xfff by the hardware for comparisons. So correctly set the exclusion range in the hardware to the last page which is _in_ the range. Fixes: b2026aa2dce44 ('x86, AMD IOMMU: add functions for programming IOMMU MMIO space') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/amd: Reserve exclusion range in iova-domainJoerg Roedel2019-05-043-6/+12
| | | | | | | | | | | | | | | | | [ Upstream commit 8aafaaf2212192012f5bae305bb31cdf7681d777 ] If a device has an exclusion range specified in the IVRS table, this region needs to be reserved in the iova-domain of that device. This hasn't happened until now and can cause data corruption on data transfered with these devices. Treat exclusion ranges as reserved regions in the iommu-core to fix the problem. Fixes: be2a022c0dd0 ('x86, AMD IOMMU: add functions to parse IOMMU memory mapping requirements for devices') Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
* iommu/dmar: Fix buffer overflow during PCI bus notificationJulia Cartwright2019-04-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit cffaaf0c816238c45cd2d06913476c83eb50f682 ] Commit 57384592c433 ("iommu/vt-d: Store bus information in RMRR PCI device path") changed the type of the path data, however, the change in path type was not reflected in size calculations. Update to use the correct type and prevent a buffer overflow. This bug manifests in systems with deep PCI hierarchies, and can lead to an overflow of the static allocated buffer (dmar_pci_notify_info_buf), or can lead to overflow of slab-allocated data. BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0 Write of size 1 at addr ffffffff90445d80 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.14.87-rt49-02406-gd0a0e96 #1 Call Trace: ? dump_stack+0x46/0x59 ? print_address_description+0x1df/0x290 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0 ? kasan_report+0x256/0x340 ? dmar_alloc_pci_notify_info+0x1d5/0x2e0 ? e820__memblock_setup+0xb0/0xb0 ? dmar_dev_scope_init+0x424/0x48f ? __down_write_common+0x1ec/0x230 ? dmar_dev_scope_init+0x48f/0x48f ? dmar_free_unused_resources+0x109/0x109 ? cpumask_next+0x16/0x20 ? __kmem_cache_create+0x392/0x430 ? kmem_cache_create+0x135/0x2f0 ? e820__memblock_setup+0xb0/0xb0 ? intel_iommu_init+0x170/0x1848 ? _raw_spin_unlock_irqrestore+0x32/0x60 ? migrate_enable+0x27a/0x5b0 ? sched_setattr+0x20/0x20 ? migrate_disable+0x1fc/0x380 ? task_rq_lock+0x170/0x170 ? try_to_run_init_process+0x40/0x40 ? locks_remove_file+0x85/0x2f0 ? dev_prepare_static_identity_mapping+0x78/0x78 ? rt_spin_unlock+0x39/0x50 ? lockref_put_or_lock+0x2a/0x40 ? dput+0x128/0x2f0 ? __rcu_read_unlock+0x66/0x80 ? __fput+0x250/0x300 ? __rcu_read_lock+0x1b/0x30 ? mntput_no_expire+0x38/0x290 ? e820__memblock_setup+0xb0/0xb0 ? pci_iommu_init+0x25/0x63 ? pci_iommu_init+0x25/0x63 ? do_one_initcall+0x7e/0x1c0 ? initcall_blacklisted+0x120/0x120 ? kernel_init_freeable+0x27b/0x307 ? rest_init+0xd0/0xd0 ? kernel_init+0xf/0x120 ? rest_init+0xd0/0xd0 ? ret_from_fork+0x1f/0x40 The buggy address belongs to the variable: dmar_pci_notify_info_buf+0x40/0x60 Fixes: 57384592c433 ("iommu/vt-d: Store bus information in RMRR PCI device path") Signed-off-by: Julia Cartwright <julia@ni.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/vt-d: Check capability before disabling protected memoryLu Baolu2019-04-201-0/+3
| | | | | | | | | | | | | | | | | [ Upstream commit 5bb71fc790a88d063507dc5d445ab8b14e845591 ] The spec states in 10.4.16 that the Protected Memory Enable Register should be treated as read-only for implementations not supporting protected memory regions (PLMR and PHMR fields reported as Clear in the Capability register). Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: mark gross <mgross@intel.com> Suggested-by: Ashok Raj <ashok.raj@intel.com> Fixes: f8bab73515ca5 ("intel-iommu: PMEN support") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/io-pgtable-arm-v7s: Only kmemleak_ignore L2 tablesNicolas Boichat2019-04-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 032ebd8548c9d05e8d2bdc7a7ec2fe29454b0ad0 ] L1 tables are allocated with __get_dma_pages, and therefore already ignored by kmemleak. Without this, the kernel would print this error message on boot, when the first L1 table is allocated: [ 2.810533] kmemleak: Trying to color unknown object at 0xffffffd652388000 as Black [ 2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S 4.19.16 #8 [ 2.831227] Workqueue: events deferred_probe_work_func [ 2.836353] Call trace: ... [ 2.852532] paint_ptr+0xa0/0xa8 [ 2.855750] kmemleak_ignore+0x38/0x6c [ 2.859490] __arm_v7s_alloc_table+0x168/0x1f4 [ 2.863922] arm_v7s_alloc_pgtable+0x114/0x17c [ 2.868354] alloc_io_pgtable_ops+0x3c/0x78 ... Fixes: e5fc9753b1a8314 ("iommu/io-pgtable: Add ARMv7 short descriptor support") Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debuggingNicolas Boichat2019-04-031-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 0a352554da69b02f75ca3389c885c741f1f63235 upstream. IOMMUs using ARMv7 short-descriptor format require page tables (level 1 and 2) to be allocated within the first 4GB of RAM, even on 64-bit systems. For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32 is defined (e.g. on arm64 platforms). For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32. Note that we do not explicitly pass GFP_DMA[32] to kmem_cache_zalloc, as this is not strictly necessary, and would cause a warning in mm/sl*b.c, as we did not update GFP_SLAB_BUG_MASK. Also, print an error when the physical address does not fit in 32-bit, to make debugging easier in the future. Link: http://lkml.kernel.org/r/20181210011504.122604-3-drinkcat@chromium.org Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32") Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Huaisheng Ye <yehs1@lenovo.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Sasha Levin <Alexander.Levin@microsoft.com> Cc: Tomasz Figa <tfiga@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yingjoe Chen <yingjoe.chen@mediatek.com> Cc: Yong Wu <yong.wu@mediatek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZEStanislaw Gruszka2019-03-271-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 4e50ce03976fbc8ae995a000c4b10c737467beaa upstream. Take into account that sg->offset can be bigger than PAGE_SIZE when setting segment sg->dma_address. Otherwise sg->dma_address will point at diffrent page, what makes DMA not possible with erros like this: xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa70c0 flags=0x0020] xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7040 flags=0x0020] xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7080 flags=0x0020] xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7100 flags=0x0020] xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7000 flags=0x0020] Additinally with wrong sg->dma_address unmap_sg will free wrong pages, what what can cause crashes like this: Feb 28 19:27:45 kernel: BUG: Bad page state in process cinnamon pfn:39e8b1 Feb 28 19:27:45 kernel: Disabling lock debugging due to kernel taint Feb 28 19:27:45 kernel: flags: 0x2ffff0000000000() Feb 28 19:27:45 kernel: raw: 02ffff0000000000 0000000000000000 ffffffff00000301 0000000000000000 Feb 28 19:27:45 kernel: raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 Feb 28 19:27:45 kernel: page dumped because: nonzero _refcount Feb 28 19:27:45 kernel: Modules linked in: ccm fuse arc4 nct6775 hwmon_vid amdgpu nls_iso8859_1 nls_cp437 edac_mce_amd vfat fat kvm_amd ccp rng_core kvm mt76x0u mt76x0_common mt76x02_usb irqbypass mt76_usb mt76x02_lib mt76 crct10dif_pclmul crc32_pclmul chash mac80211 amd_iommu_v2 ghash_clmulni_intel gpu_sched i2c_algo_bit ttm wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_codec_hdmi snd_hda_intel drm snd_hda_codec aesni_intel snd_hda_core snd_hwdep aes_x86_64 crypto_simd snd_pcm cfg80211 cryptd mousedev snd_timer glue_helper pcspkr r8169 input_leds realtek agpgart libphy rfkill snd syscopyarea sysfillrect sysimgblt fb_sys_fops soundcore sp5100_tco k10temp i2c_piix4 wmi evdev gpio_amdpt pinctrl_amd mac_hid pcc_cpufreq acpi_cpufreq sg ip_tables x_tables ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) dm_mod(E) serio_raw(E) atkbd(E) libps2(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) xhci_pci(E) xhci_hcd(E) Feb 28 19:27:45 kernel: scsi_mod(E) i8042(E) serio(E) bcache(E) crc64(E) Feb 28 19:27:45 kernel: CPU: 2 PID: 896 Comm: cinnamon Tainted: G B W E 4.20.12-arch1-1-custom #1 Feb 28 19:27:45 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P1.20 06/26/2018 Feb 28 19:27:45 kernel: Call Trace: Feb 28 19:27:45 kernel: dump_stack+0x5c/0x80 Feb 28 19:27:45 kernel: bad_page.cold.29+0x7f/0xb2 Feb 28 19:27:45 kernel: __free_pages_ok+0x2c0/0x2d0 Feb 28 19:27:45 kernel: skb_release_data+0x96/0x180 Feb 28 19:27:45 kernel: __kfree_skb+0xe/0x20 Feb 28 19:27:45 kernel: tcp_recvmsg+0x894/0xc60 Feb 28 19:27:45 kernel: ? reuse_swap_page+0x120/0x340 Feb 28 19:27:45 kernel: ? ptep_set_access_flags+0x23/0x30 Feb 28 19:27:45 kernel: inet_recvmsg+0x5b/0x100 Feb 28 19:27:45 kernel: __sys_recvfrom+0xc3/0x180 Feb 28 19:27:45 kernel: ? handle_mm_fault+0x10a/0x250 Feb 28 19:27:45 kernel: ? syscall_trace_enter+0x1d3/0x2d0 Feb 28 19:27:45 kernel: ? __audit_syscall_exit+0x22a/0x290 Feb 28 19:27:45 kernel: __x64_sys_recvfrom+0x24/0x30 Feb 28 19:27:45 kernel: do_syscall_64+0x5b/0x170 Feb 28 19:27:45 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Cc: stable@vger.kernel.org Reported-and-tested-by: Jan Viktorin <jan.viktorin@gmail.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Fixes: 80187fd39dcb ('iommu/amd: Optimize map_sg and unmap_sg') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/amd: Fix IOMMU page flush when detach device from a domainSuravee Suthikulpanit2019-03-131-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 9825bd94e3a2baae1f4874767ae3a7d4c049720e ] When a VM is terminated, the VFIO driver detaches all pass-through devices from VFIO domain by clearing domain id and page table root pointer from each device table entry (DTE), and then invalidates the DTE. Then, the VFIO driver unmap pages and invalidate IOMMU pages. Currently, the IOMMU driver keeps track of which IOMMU and how many devices are attached to the domain. When invalidate IOMMU pages, the driver checks if the IOMMU is still attached to the domain before issuing the invalidate page command. However, since VFIO has already detached all devices from the domain, the subsequent INVALIDATE_IOMMU_PAGES commands are being skipped as there is no IOMMU attached to the domain. This results in data corruption and could cause the PCI device to end up in indeterministic state. Fix this by invalidate IOMMU pages when detach a device, and before decrementing the per-domain device reference counts. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Joerg Roedel <joro@8bytes.org> Co-developed-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Fixes: 6de8ad9b9ee0 ('x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/amd: Unmap all mapped pages in error path of map_sgJerry Snitselaar2019-03-131-1/+1
| | | | | | | | | | | | | | | | | | [ Upstream commit f1724c0883bb0ce93b8dcb94b53dcca3b75ac9a7 ] In the error path of map_sg there is an incorrect if condition for breaking out of the loop that searches the scatterlist for mapped pages to unmap. Instead of breaking out of the loop once all the pages that were mapped have been unmapped, it will break out of the loop after it has unmapped 1 page. Fix the condition, so it breaks out of the loop only after all the mapped pages have been unmapped. Fixes: 80187fd39dcb ("iommu/amd: Optimize map_sg and unmap_sg") Cc: Joerg Roedel <joro@8bytes.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/amd: Call free_iova_fast with pfn in map_sgJerry Snitselaar2019-03-131-1/+1
| | | | | | | | | | | | | | | | [ Upstream commit 51d8838d66d3249508940d8f59b07701f2129723 ] In the error path of map_sg, free_iova_fast is being called with address instead of the pfn. This results in a bad value getting into the rcache, and can result in hitting a BUG_ON when iova_magazine_free_pfns is called. Cc: Joerg Roedel <joro@8bytes.org> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Fixes: 80187fd39dcb ("iommu/amd: Optimize map_sg and unmap_sg") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/arm-smmu-v3: Use explicit mb() when moving cons pointerWill Deacon2019-02-121-1/+7
| | | | | | | | | | | | | | | | | | | | [ Upstream commit a868e8530441286342f90c1fd9c5f24de3aa2880 ] After removing an entry from a queue (e.g. reading an event in arm_smmu_evtq_thread()) it is necessary to advance the MMIO consumer pointer to free the queue slot back to the SMMU. A memory barrier is required here so that all reads targetting the queue entry have completed before the consumer pointer is updated. The implementation of queue_inc_cons() relies on a writel() to complete the previous reads, but this is incorrect because writel() is only guaranteed to complete prior writes. This patch replaces the call to writel() with an mb(); writel_relaxed() sequence, which gives us the read->write ordering which we require. Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/arm-smmu: Add support for qcom,smmu-v2 variantVivek Gautam2019-02-121-0/+3
| | | | | | | | | | | | | | | | | | [ Upstream commit 89cddc563743cb1e0068867ac97013b2a5bf86aa ] qcom,smmu-v2 is an arm,smmu-v2 implementation with specific clock and power requirements. On msm8996, multiple cores, viz. mdss, video, etc. use this smmu. On sdm845, this smmu is used with gpu. Add bindings for the same. Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Tested-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/arm-smmu-v3: Avoid memory corruption from Hisilicon MSI payloadsZhen Lei2019-02-121-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit 84a9a75774961612d0c7dd34a1777e8f98a65abd ] The GITS_TRANSLATER MMIO doorbell register in the ITS hardware is architected to be 4 bytes in size, yet on hi1620 and earlier, Hisilicon have allocated the adjacent 4 bytes to carry some IMPDEF sideband information which results in an 8-byte MSI payload being delivered when signalling an interrupt: MSIAddr: |----4bytes----|----4bytes----| | MSIData | IMPDEF | This poses no problem for the ITS hardware because the adjacent 4 bytes are reserved in the memory map. However, when delivering MSIs to memory, as we do in the SMMUv3 driver for signalling the completion of a SYNC command, the extended payload will corrupt the 4 bytes adjacent to the "sync_count" member in struct arm_smmu_device. Fortunately, the current layout allocates these bytes to padding, but this is fragile and we should make this explicit. Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> [will: Rewrote commit message and comment] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/amd: Fix amd_iommu=force_isolationYu Zhao2019-02-121-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit c12b08ebbe16f0d3a96a116d86709b04c1ee8e74 ] The parameter is still there but it's ignored. We need to check its value before deciding to go into passthrough mode for AMD IOMMU v2 capable device. We occasionally use this parameter to force v2 capable device into translation mode to debug memory corruption that we suspect is caused by DMA writes. To address the following comment from Joerg Roedel on the first version, v2 capability of device is completely ignored. > This breaks the iommu_v2 use-case, as it needs a direct mapping for the > devices that support it. And from Documentation/admin-guide/kernel-parameters.txt: This option does not override iommu=pt Fixes: aafd8ba0ca74 ("iommu/amd: Implement add_device and remove_device") Signed-off-by: Yu Zhao <yuzhao@google.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/vt-d: Fix memory leak in intel_iommu_put_resv_regions()Gerald Schaefer2019-02-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | commit 198bc3252ea3a45b0c5d500e6a5b91cfdd08f001 upstream. Commit 9d3a4de4cb8d ("iommu: Disambiguate MSI region types") changed the reserved region type in intel_iommu_get_resv_regions() from IOMMU_RESV_RESERVED to IOMMU_RESV_MSI, but it forgot to also change the type in intel_iommu_put_resv_regions(). This leads to a memory leak, because now the check in intel_iommu_put_resv_regions() for IOMMU_RESV_RESERVED will never be true, and no allocated regions will be freed. Fix this by changing the region type in intel_iommu_put_resv_regions() to IOMMU_RESV_MSI, matching the type of the allocated regions. Fixes: 9d3a4de4cb8d ("iommu: Disambiguate MSI region types") Cc: <stable@vger.kernel.org> # v4.11+ Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/vt-d: Handle domain agaw being less than iommu agawSohil Mehta2019-01-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3569dd07aaad71920c5ea4da2d5cc9a167c1ffd4 upstream. The Intel IOMMU driver opportunistically skips a few top level page tables from the domain paging directory while programming the IOMMU context entry. However there is an implicit assumption in the code that domain's adjusted guest address width (agaw) would always be greater than IOMMU's agaw. The IOMMU capabilities in an upcoming platform cause the domain's agaw to be lower than IOMMU's agaw. The issue is seen when the IOMMU supports both 4-level and 5-level paging. The domain builds a 4-level page table based on agaw of 2. However the IOMMU's agaw is set as 3 (5-level). In this case the code incorrectly tries to skip page page table levels. This causes the IOMMU driver to avoid programming the context entry. The fix handles this case and programs the context entry accordingly. Fixes: de24e55395698 ("iommu/vt-d: Simplify domain_context_mapping_one") Cc: <stable@vger.kernel.org> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reported-by: Ramos Falcon, Ernesto R <ernesto.r.ramos.falcon@intel.com> Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/arm-smmu-v3: Fix big-endian CMD_SYNC writesRobin Murphy2019-01-091-1/+7
| | | | | | | | | | | | | | | | | | | | | | | commit 3cd508a8c1379427afb5e16c2e0a7c986d907853 upstream. When we insert the sync sequence number into the CMD_SYNC.MSIData field, we do so in CPU-native byte order, before writing out the whole command as explicitly little-endian dwords. Thus on big-endian systems, the SMMU will receive and write back a byteswapped version of sync_nr, which would be perfect if it were targeting a similarly-little-endian ITS, but since it's actually writing back to memory being polled by the CPUs, they're going to end up seeing the wrong thing. Since the SMMU doesn't care what the MSIData actually contains, the minimal-overhead solution is to simply add an extra byteswap initially, such that it then writes back the big-endian format directly. Cc: <stable@vger.kernel.org> Fixes: 37de98f8f1cf ("iommu/arm-smmu-v3: Use CMD_SYNC completion MSI") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/vt-d: Use memunmap to free memremapPan Bian2018-12-131-1/+1
| | | | | | | | | | | | [ Upstream commit 829383e183728dec7ed9150b949cd6de64127809 ] memunmap() should be used to free the return of memremap(), not iounmap(). Fixes: dfddb969edf0 ('iommu/vt-d: Switch from ioremap_cache to memremap') Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* amd/iommu: Fix Guest Virtual APIC Log Tail Address RegisterFilippo Sironi2018-12-131-1/+2
| | | | | | | | | | | | | | | [ Upstream commit ab99be4683d9db33b100497d463274ebd23bd67e ] This register should have been programmed with the physical address of the memory location containing the shadow tail pointer for the guest virtual APIC log instead of the base address. Fixes: 8bda0cfbdc1a ('iommu/amd: Detect and initialize guest vAPIC log') Signed-off-by: Filippo Sironi <sironi@amazon.de> Signed-off-by: Wei Wang <wawei@amazon.de> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/ipmmu-vmsa: Fix crash on early domain freeGeert Uytterhoeven2018-12-131-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit e5b78f2e349eef5d4fca5dc1cf5a3b4b2cc27abd ] If iommu_ops.add_device() fails, iommu_ops.domain_free() is still called, leading to a crash, as the domain was only partially initialized: ipmmu-vmsa e67b0000.mmu: Cannot accommodate DMA translation for IOMMU page tables sata_rcar ee300000.sata: Unable to initialize IPMMU context iommu: Failed to add device ee300000.sata to group 0: -22 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038 ... Call trace: ipmmu_domain_free+0x1c/0xa0 iommu_group_release+0x48/0x68 kobject_put+0x74/0xe8 kobject_del.part.0+0x3c/0x50 kobject_put+0x60/0xe8 iommu_group_get_for_dev+0xa8/0x1f0 ipmmu_add_device+0x1c/0x40 of_iommu_configure+0x118/0x190 Fix this by checking if the domain's context already exists, before trying to destroy it. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Fixes: d25a2a16f0889 ('iommu: Add driver for Renesas VMSA-compatible IPMMU') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/vt-d: Fix NULL pointer dereference in prq_event_thread()Lu Baolu2018-12-131-1/+1
| | | | | | | | | | | | | | | | [ Upstream commit 19ed3e2dd8549c1a34914e8dad01b64e7837645a ] When handling page request without pasid event, go to "no_pasid" branch instead of "bad_req". Otherwise, a NULL pointer deference will happen there. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Fixes: a222a7f0bb6c9 'iommu/vt-d: Implement page request handling' Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
* iommu/arm-smmu: Ensure that page-table updates are visible before TLBIWill Deacon2018-11-131-0/+6
| | | | | | | | | | | | | | | | | | | | commit 7d321bd3542500caf125249f44dc37cb4e738013 upstream. The IO-pgtable code relies on the driver TLB invalidation callbacks to ensure that all page-table updates are visible to the IOMMU page-table walker. In the case that the page-table walker is cache-coherent, we cannot rely on an implicit DSB from the DMA-mapping code, so we must ensure that we execute a DSB in our tlb_add_flush() callback prior to triggering the invalidation. Cc: <stable@vger.kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Fixes: 2df7a25ce4a7 ("iommu/arm-smmu: Clean up DMA API usage") Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/amd: Clear memory encryption mask from physical addressSingh, Brijesh2018-10-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Boris Ostrovsky reported a memory leak with device passthrough when SME is active. The VFIO driver uses iommu_iova_to_phys() to get the physical address for an iova. This physical address is later passed into vfio_unmap_unpin() to unpin the memory. The vfio_unmap_unpin() uses pfn_valid() before unpinning the memory. The pfn_valid() check was failing because encryption mask was part of the physical address returned. This resulted in the memory not being unpinned and therefore leaked after the guest terminates. The memory encryption mask must be cleared from the physical address in iommu_iova_to_phys(). Fixes: 2543a786aa25 ("iommu/amd: Allow the AMD IOMMU to work with memory encryption") Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: <iommu@lists.linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: kvm@vger.kernel.org Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> # 4.14+ Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/amd: Return devid as alias for ACPI HID devicesArindam Nath2018-09-261-0/+6
| | | | | | | | | | | | | ACPI HID devices do not actually have an alias for them in the IVRS. But dev_data->alias is still used for indexing into the IOMMU device table for devices being handled by the IOMMU. So for ACPI HID devices, we simply return the corresponding devid as an alias, as parsed from IVRS table. Signed-off-by: Arindam Nath <arindam.nath@amd.com> Fixes: 2bf9a0a12749 ('iommu/amd: Add iommu support for ACPI HID devices') Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/vt-d: Handle memory shortage on pasid table allocationLu Baolu2018-09-252-4/+4
| | | | | | | | | | | | | | | | | | | | Pasid table memory allocation could return failure due to memory shortage. Limit the pasid table size to 1MiB because current 8MiB contiguous physical memory allocation can be hard to come by. W/o a PASID table, the device could continue to work with only shared virtual memory impacted. So, let's go ahead with context mapping even the memory allocation for pasid table failed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107783 Fixes: cc580e41260d ("iommu/vt-d: Per PCI device pasid table interfaces") Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Reported-and-tested-by: Pelton Kyle D <kyle.d.pelton@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/rockchip: Free irqs in shutdown handlerHeiko Stuebner2018-09-251-0/+6
| | | | | | | | | | | | | | | | | In the iommu's shutdown handler we disable runtime-pm which could result in the irq-handler running unclocked and since commit 3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework") we warn about that fact. This can cause warnings on shutdown on some Rockchip machines, so free the irqs in the shutdown handler before we disable runtime-pm. Reported-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Fixes: 3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* Merge tag 'armsoc-late' of ↵Linus Torvalds2018-08-251-17/+28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC late updates from Olof Johansson: "A couple of late-merged changes that would be useful to get in this merge window: - Driver support for reset of audio complex on Meson platforms. The audio driver went in this merge window, and these changes have been in -next for a while (just not in our tree). - Power management fixes for IOMMU on Rockchip platforms, getting closer to kexec working on them, including Chromebooks. - Another pass updating "arm,psci" -> "psci" for some properties that have snuck in since last time it was done" * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: iommu/rockchip: Move irq request past pm_runtime_enable iommu/rockchip: Handle errors returned from PM framework arm64: rockchip: Force CONFIG_PM on Rockchip systems ARM: rockchip: Force CONFIG_PM on Rockchip systems arm64: dts: Fix various entry-method properties to reflect documentation reset: imx7: Fix always writing bits as 0 reset: meson: add meson audio arb driver reset: meson: add dt-bindings for meson-axg audio arb
| * iommu/rockchip: Move irq request past pm_runtime_enableMarc Zyngier2018-08-241-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enabling the interrupt early, before power has been applied to the device, can result in an interrupt being delivered too early if: - the IOMMU shares an interrupt with a VOP - the VOP has a pending interrupt (after a kexec, for example) In these conditions, we end-up taking the interrupt without the IOMMU being ready to handle the interrupt (not powered on). Moving the interrupt request past the pm_runtime_enable() call makes sure we can at least access the IOMMU registers. Note that this is only a partial fix, and that the VOP interrupt will still be screaming until the VOP driver kicks in, which advocates for a more synchronized interrupt enabling/disabling approach. Fixes: 0f181d3cf7d98 ("iommu/rockchip: Add runtime PM support") Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Olof Johansson <olof@lixom.net>
| * iommu/rockchip: Handle errors returned from PM frameworkMarc Zyngier2018-08-241-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pm_runtime_get_if_in_use can fail: either PM has been disabled altogether (-EINVAL), or the device hasn't been enabled yet (0). Sadly, the Rockchip IOMMU driver tends to conflate the two things by considering a non-zero return value as successful. This has the consequence of hiding other bugs, so let's handle this case throughout the driver, with a WARN_ON_ONCE so that we can try and work out what happened. Fixes: 0f181d3cf7d98 ("iommu/rockchip: Add runtime PM support") Reviewed-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Olof Johansson <olof@lixom.net>
* | Merge tag 'iommu-updates-v4.19' of ↵Linus Torvalds2018-08-2428-232/+739
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - PASID table handling updates for the Intel VT-d driver. It implements a global PASID space now so that applications usings multiple devices will just have one PASID. - A new config option to make iommu passthroug mode the default. - New sysfs attribute for iommu groups to export the type of the default domain. - A debugfs interface (for debug only) usable by IOMMU drivers to export internals to user-space. - R-Car Gen3 SoCs support for the ipmmu-vmsa driver - The ARM-SMMU now aborts transactions from unknown devices and devices not attached to any domain. - Various cleanups and smaller fixes all over the place. * tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (42 commits) iommu/omap: Fix cache flushes on L2 table entries iommu: Remove the ->map_sg indirection iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel iommu/arm-smmu-v3: Prevent any devices access to memory without registration iommu/ipmmu-vmsa: Don't register as BUS IOMMU if machine doesn't have IPMMU-VMSA iommu/ipmmu-vmsa: Clarify supported platforms iommu/ipmmu-vmsa: Fix allocation in atomic context iommu: Add config option to set passthrough as default iommu: Add sysfs attribyte for domain type iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register iommu/arm-smmu: Error out only if not enough context interrupts iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE iommu/io-pgtable-arm: Fix pgtable allocation in selftest iommu/vt-d: Remove the obsolete per iommu pasid tables iommu/vt-d: Apply per pci device pasid table in SVA iommu/vt-d: Allocate and free pasid table iommu/vt-d: Per PCI device pasid table interfaces iommu/vt-d: Add for_each_device_domain() helper iommu/vt-d: Move device_domain_info to header iommu/vt-d: Apply global PASID in SVA ...
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| *-----------. \ Merge branches 'arm/shmobile', 'arm/renesas', 'arm/msm', 'arm/smmu', ↵Joerg Roedel2018-08-0828-251/+819
| |\ \ \ \ \ \ \ \ | | |_|_|_|_|_|_|/ | |/| | | | | | | | | | | | | | | | 'arm/omap', 'x86/amd', 'x86/vt-d' and 'core' into next
| | | | | | | | * iommu: Remove the ->map_sg indirectionChristoph Hellwig2018-08-0815-17/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All iommu drivers use the default_iommu_map_sg implementation, and there is no good reason to ever override it. Just expose it as iommu_map_sg directly and remove the indirection, specially in our post-spectre world where indirect calls are horribly expensive. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Add config option to set passthrough as defaultOlof Johansson2018-07-272-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the default behavior to be controlled by a kernel config option instead of changing the commandline for the kernel to include "iommu.passthrough=on" or "iommu=pt" on machines where this is desired. Likewise, for machines where this config option is enabled, it can be disabled at boot time with "iommu.passthrough=off" or "iommu=nopt". Also corrected iommu=pt documentation for IA-64, since it has no code that parses iommu= at all. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Add sysfs attribyte for domain typeOlof Johansson2018-07-271-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While we could print it at setup time, this is an easier way to match each device to their default IOMMU allocation type. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu/amd: Add basic debugfs infrastructure for AMD IOMMUGary R Hook2018-07-066-2/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement a skeleton framework for debugfs support in the AMD IOMMU. Add an AMD-specific Kconfig boolean that depends upon general enablement of DebugFS in the IOMMU. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | | * iommu: Enable debugfs exposure of IOMMU driver internalsGary R Hook2018-07-064-0/+79
| | |_|_|_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide base enablement for using debugfs to expose internal data of an IOMMU driver. When called, create the /sys/kernel/debug/iommu directory. Emit a strong warning at boot time to indicate that this feature is enabled. This function is called from iommu_init, and creates the initial DebugFS directory. Drivers may then call iommu_debugfs_new_driver_dir() to instantiate a device-specific directory to expose internal data. It will return a pointer to the new dentry structure created in /sys/kernel/debug/iommu, or NULL in the event of a failure. Since the IOMMU driver can not be removed from the running system, there is no need for an "off" function. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Remove the obsolete per iommu pasid tablesLu Baolu2018-07-202-18/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The obsolete per iommu pasid tables are no longer used. Hence, clean up them. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Apply per pci device pasid table in SVALu Baolu2018-07-202-32/+19Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch applies the per pci device pasid table in the Shared Virtual Address (SVA) implementation. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Allocate and free pasid tableLu Baolu2018-07-201-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allocates a PASID table for a PCI device at the time when the dmar dev_info is attached to dev->archdata.iommu, and free it in the opposite case. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Per PCI device pasid table interfacesLu Baolu2018-07-204-4/+198
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the interfaces for per PCI device pasid table management. Currently we allocate one pasid table for all PCI devices under the scope of an IOMMU. It's insecure in some cases where multiple devices under one single IOMMU unit support PASID features. With per PCI device pasid table, we can achieve finer protection and isolation granularity. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Suggested-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Add for_each_device_domain() helperLu Baolu2018-07-201-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a helper named for_each_device_domain() to iterate over the elements in device_domain_list and invoke a callback against each element. This allows to search the device_domain list in other source files. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Move device_domain_info to headerLu Baolu2018-07-201-59/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows the per device iommu data and some helpers to be used in other files. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Apply global PASID in SVALu Baolu2018-07-201-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch applies the global pasid name space in the shared virtual address (SVA) implementation. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | | | | | | * iommu/vt-d: Avoid using idr_for_each_entry()Lu Baolu2018-07-201-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | idr_for_each_entry() is used to iteratte over idr elements of a given type. It isn't suitable for the globle pasid idr since the pasid idr consumer could specify different types of pointers to bind with a pasid. Cc: Ashok Raj <ashok.raj@intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Liu Yi L <yi.l.liu@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>