path: root/drivers/iommu/arm-smmu-v3.c
Commit message  (Author, Date; Files, Lines changed)
* iommu/arm-smmu: Mark expected switch fall-through  (Anders Roxell, 2019-08-06; 1 file, -2/+2)
      Now that -Wimplicit-fallthrough is passed to GCC by default, the
      following warning shows up:

        ../drivers/iommu/arm-smmu-v3.c: In function ‘arm_smmu_write_strtab_ent’:
        ../drivers/iommu/arm-smmu-v3.c:1189:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
           if (disable_bypass)
           ^
        ../drivers/iommu/arm-smmu-v3.c:1191:3: note: here
           default:
           ^~~~~~~

      Rework so that the compiler doesn't warn about fall-through. Make it
      clearer by calling 'BUG_ON()' when disable_bypass is set, and always
      'break;'

      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Acked-by: Will Deacon <will@kernel.org>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
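      A minimal sketch of the reworked case, with simplified names (the enum,
      labels and helper below are hypothetical, and the exact BUG_ON()
      condition is reconstructed from the quoted warning context rather than
      copied from the driver):

        #include <linux/bug.h>
        #include <linux/types.h>

        enum ste_cfg { CFG_BYPASS, CFG_TRANS, CFG_ABORT };      /* hypothetical */

        static void check_live_ste(enum ste_cfg cfg, bool disable_bypass)
        {
                switch (cfg) {
                case CFG_BYPASS:
                case CFG_TRANS:
                        break;
                case CFG_ABORT:
                        /* Was an implicit fall-through into BUG(); now the
                         * invalid combination is trapped explicitly and the
                         * case always breaks. */
                        BUG_ON(!disable_bypass);
                        break;
                default:
                        BUG();          /* STE corruption */
                }
        }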
* Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core  (Linus Torvalds, 2019-07-12; 1 file, -1/+1)
|\
      Pull driver core and debugfs updates from Greg KH:
      "Here is the "big" driver core and debugfs changes for 5.3-rc1

      It's a lot of different patches, all across the tree due to some api
      changes and lots of debugfs cleanups.

      Other than the debugfs cleanups, in this set of changes we have:

       - bus iteration function cleanups
       - scripts/get_abi.pl tool to display and parse Documentation/ABI
         entries in a simple way
       - cleanups to Documentation/ABI/ entries to make them parse easier
         due to typos and other minor things
       - default_attrs use for some ktype users
       - driver model documentation file conversions to .rst
       - compressed firmware file loading
       - deferred probe fixes

      All of these have been in linux-next for a while, with a bunch of merge
      issues that Stephen has been patient with me for"

      * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
        debugfs: make error message a bit more verbose
        orangefs: fix build warning from debugfs cleanup patch
        ubifs: fix build warning after debugfs cleanup patch
        driver: core: Allow subsystems to continue deferring probe
        drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
        arch_topology: Remove error messages on out-of-memory conditions
        lib: notifier-error-inject: no need to check return value of debugfs_create functions
        swiotlb: no need to check return value of debugfs_create functions
        ceph: no need to check return value of debugfs_create functions
        sunrpc: no need to check return value of debugfs_create functions
        ubifs: no need to check return value of debugfs_create functions
        orangefs: no need to check return value of debugfs_create functions
        nfsd: no need to check return value of debugfs_create functions
        lib: 842: no need to check return value of debugfs_create functions
        debugfs: provide pr_fmt() macro
        debugfs: log errors when something goes wrong
        drivers: s390/cio: Fix compilation warning about const qualifiers
        drivers: Add generic helper to match by of_node
        driver_find_device: Unify the match function with class_find_device()
        bus_find_device: Unify the match callback with class_find_device
        ...
| * driver_find_device: Unify the match function with class_find_device()  (Suzuki K Poulose, 2019-06-24; 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver_find_device() accepts a match function pointer to filter the devices for lookup, similar to bus/class_find_device(). However, there is a minor difference in the prototype for the match parameter for driver_find_device() with the now unified version accepted by {bus/class}_find_device(), where it doesn't accept a "const" qualifier for the data argument. This prevents us from reusing the generic match functions for driver_find_device(). For this reason, change the prototype of the driver_find_device() to make the "match" parameter in line with {bus/class}_find_device() and adjust its callers to use the const qualifier. Also, we could now promote the "data" parameter to const as we pass it down as a const parameter to the match functions. Cc: Corey Minyard <minyard@acm.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Sebastian Ott <sebott@linux.ibm.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Nehal Shah <nehal-bakulchandra.shah@amd.com> Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
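      As a sketch, a caller written against the unified prototype looks like
      this (the wrapper below is illustrative; the point is the const-qualified
      match callback and data argument):

        #include <linux/device.h>
        #include <linux/fwnode.h>

        /* Matches a device by its fwnode, as the SMMU driver does. */
        static int match_node(struct device *dev, const void *data)
        {
                return dev->fwnode == data;
        }

        static struct device *find_dev_by_fwnode(struct device_driver *drv,
                                                 const struct fwnode_handle *fwnode)
        {
                return driver_find_device(drv, NULL, fwnode, match_node);
        }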
* | iommu/arm-smmu-v3: Invalidate ATC when detaching a device  (Jean-Philippe Brucker, 2019-07-04; 1 file, -1/+4)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We make the invalid assumption in arm_smmu_detach_dev() that the ATC is clear after calling pci_disable_ats(). For one thing, only enabling the PCIe ATS capability constitutes an implicit invalidation event, so the comment was wrong. More importantly, the ATS capability isn't necessarily disabled by pci_disable_ats() in a PF, if the associated VFs have ATS enabled. Explicitly invalidate all ATC entries in arm_smmu_detach_dev(). The endpoint cannot form new ATC entries because STE.EATS is clear. Fixes: 9ce27afc0830 ("iommu/arm-smmu-v3: Add support for PCI ATS") Reported-by: Manoj Kumar <Manoj.Kumar3@arm.com> Reported-by: Robin Murphy <Robin.Murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* | iommu/arm-smmu-v3: Fix compilation when CONFIG_CMA=n  (Will Deacon, 2019-07-02; 1 file, -0/+6)
      When compiling a kernel without support for CMA, CONFIG_CMA_ALIGNMENT is
      not defined, which results in the following build failure:

        In file included from ./include/linux/list.h:9:0,
                         from ./include/linux/kobject.h:19,
                         from ./include/linux/of.h:17,
                         from ./include/linux/irqdomain.h:35,
                         from ./include/linux/acpi.h:13,
                         from drivers/iommu/arm-smmu-v3.c:12:
        drivers/iommu/arm-smmu-v3.c: In function ‘arm_smmu_device_hw_probe’:
        drivers/iommu/arm-smmu-v3.c:194:40: error: ‘CONFIG_CMA_ALIGNMENT’ undeclared (first use in this function)
         #define Q_MAX_SZ_SHIFT (PAGE_SHIFT + CONFIG_CMA_ALIGNMENT)

      Fix the breakage by capping the maximum queue size based on MAX_ORDER
      when CMA is not enabled.

      Reported-by: Zhangshaokun <zhangshaokun@hisilicon.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
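      One way to express the cap described above (the CMA branch matches the
      macro quoted in the error output; the exact MAX_ORDER fallback expression
      is an assumption based on the commit text):

        #include <linux/mm.h>           /* MAX_ORDER, PAGE_SHIFT */

        #ifdef CONFIG_CMA_ALIGNMENT
        #define Q_MAX_SZ_SHIFT  (PAGE_SHIFT + CONFIG_CMA_ALIGNMENT)
        #else
        #define Q_MAX_SZ_SHIFT  (PAGE_SHIFT + MAX_ORDER - 1)
        #endif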
* | iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag  (Will Deacon, 2019-06-25; 1 file, -3/+1)
| | | | | | | | | | | | | | | | | | | | | | IO_PGTABLE_QUIRK_NO_DMA is a bit of a misnomer, since it's really just an indication of whether or not the page-table walker for the IOMMU is coherent with the CPU caches. Since cache coherency is more than just a quirk, replace the flag with its own field in the io_pgtable_cfg structure. Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Will Deacon <will@kernel.org>
* | iommu/arm-smmu-v3: Increase maximum size of queues  (Will Deacon, 2019-06-18; 1 file, -16/+38)
|/
      We've been artificially limiting the size of our queues to 4k so that we
      don't end up allocating huge amounts of physically-contiguous memory at
      probe time. However, 4k is only enough for 256 commands in the command
      queue, so instead let's try to allocate the largest queue that the SMMU
      supports, retrying with a smaller size if the allocation fails.

      The caveat here is that we have to limit our upper bound based on
      CONFIG_CMA_ALIGNMENT to ensure that our queue allocations remain
      naturally aligned, which is required by the SMMU architecture.

      Signed-off-by: Will Deacon <will.deacon@arm.com>
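      A rough sketch of the "largest first, then back off" allocation; the
      queue structure, entry handling and minimum bound below are illustrative,
      not the driver's:

        #include <linux/dma-mapping.h>
        #include <linux/errno.h>

        struct example_queue {
                void            *base;
                dma_addr_t      base_dma;
        };

        static int alloc_queue(struct device *dev, struct example_queue *q,
                               unsigned int max_shift)
        {
                size_t qsz;

                /* Try the biggest size the hardware supports (capped by the
                 * caller, e.g. at Q_MAX_SZ_SHIFT), halving on failure. */
                do {
                        qsz = 1UL << max_shift;
                        q->base = dmam_alloc_coherent(dev, qsz, &q->base_dma,
                                                      GFP_KERNEL);
                } while (!q->base && --max_shift >= PAGE_SHIFT);

                return q->base ? 0 : -ENOMEM;
        }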
* iommu/arm-smmu-v3: Don't disable SMMU in kdump kernel  (Will Deacon, 2019-04-23; 1 file, -6/+4)
| | | | | | | | | | | | | | | | | | Disabling the SMMU when probing from within a kdump kernel so that all incoming transactions are terminated can prevent the core of the crashed kernel from being transferred off the machine if all I/O devices are behind the SMMU. Instead, continue to probe the SMMU after it is disabled so that we can reinitialise it entirely and re-attach the DMA masters as they are reset. Since the kdump kernel may not have drivers for all of the active DMA masters, we suppress fault reporting to avoid spamming the console and swamping the IRQ threads. Reported-by: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com> Tested-by: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Disable tagged pointers  (Jean-Philippe Brucker, 2019-04-23; 1 file, -1/+0)
| | | | | | | | | | | | | | | | | | | | | The ARM architecture has a "Top Byte Ignore" (TBI) option that makes the MMU mask out bits [63:56] of an address, allowing a userspace application to store data in its pointers. This option is incompatible with PCI ATS. If TBI is enabled in the SMMU and userspace triggers DMA transactions on tagged pointers, the endpoint might create ATC entries for addresses that include a tag. Software would then have to send ATC invalidation packets for each 255 possible alias of an address, or just wipe the whole address space. This is not a viable option, so disable TBI. The impact of this change is unclear, since there are very few users of tagged pointers, much less SVA. But the requirement introduced by this patch doesn't seem excessive: a userspace application using both tagged pointers and SVA should now sanitize addresses (clear the tag) before using them for device DMA. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Add support for PCI ATS  (Jean-Philippe Brucker, 2019-04-23; 1 file, -6/+195)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PCIe devices can implement their own TLB, named Address Translation Cache (ATC). Enable Address Translation Service (ATS) for devices that support it and send them invalidation requests whenever we invalidate the IOTLBs. ATC invalidation is allowed to take up to 90 seconds, according to the PCIe spec, so it is possible to get a SMMU command queue timeout during normal operations. However we expect implementations to complete invalidation in reasonable time. We only enable ATS for "trusted" devices, and currently rely on the pci_dev->untrusted bit. For ATS we have to trust that: (a) The device doesn't issue "translated" memory requests for addresses that weren't returned by the SMMU in a Translation Completion. In particular, if we give control of a device or device partition to a VM or userspace, software cannot program the device to access arbitrary "translated" addresses. (b) The device follows permissions granted by the SMMU in a Translation Completion. If the device requested read+write permission and only got read, then it doesn't write. (c) The device doesn't send Translated transactions for an address that was invalidated by an ATC invalidation. Note that the PCIe specification explicitly requires all of these, so we can assume that implementations will cleanly shield ATCs from software. All ATS translated requests still go through the SMMU, to walk the stream table and check that the device is actually allowed to send translated requests. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Link domains and devices  (Jean-Philippe Brucker, 2019-04-23; 1 file, -1/+20)
| | | | | | | | | | | | | | | | | When removing a mapping from a domain, we need to send an invalidation to all devices that might have stored it in their Address Translation Cache (ATC). In addition when updating the context descriptor of a live domain, we'll need to send invalidations for all devices attached to it. Maintain a list of devices in each domain, protected by a spinlock. It is updated every time we attach or detach devices to and from domains. It needs to be a spinlock because we'll invalidate ATC entries from within hardirq-safe contexts, but it may be possible to relax the read side with RCU later. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
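      The bookkeeping described above, sketched with simplified structures
      (names are stand-ins, not the driver's):

        #include <linux/list.h>
        #include <linux/spinlock.h>

        struct ex_domain {
                spinlock_t              devices_lock;
                struct list_head        devices;        /* attached masters */
        };

        struct ex_master {
                struct list_head        domain_head;    /* entry in ex_domain.devices */
        };

        static void ex_attach(struct ex_domain *dom, struct ex_master *master)
        {
                unsigned long flags;

                /* hardirq-safe, since ATC invalidation may walk the list
                 * from such contexts */
                spin_lock_irqsave(&dom->devices_lock, flags);
                list_add(&master->domain_head, &dom->devices);
                spin_unlock_irqrestore(&dom->devices_lock, flags);
        }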
* iommu/arm-smmu-v3: Add a master->domain pointer  (Jean-Philippe Brucker, 2019-04-23; 1 file, -47/+45)
| | | | | | | | | | As we're going to track domain-master links more closely for ATS and CD invalidation, add pointer to the attached domain in struct arm_smmu_master. As a result, arm_smmu_strtab_ent is redundant and can be removed. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Store StreamIDs in master  (Jean-Philippe Brucker, 2019-04-23; 1 file, -15/+15)
| | | | | | | | | Simplify the attach/detach code a bit by keeping a pointer to the stream IDs in the master structure. Although not completely obvious here, it does make the subsequent support for ATS, PRI and PASID a bit simpler. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Rename arm_smmu_master_data to arm_smmu_master  (Jean-Philippe Brucker, 2019-04-23; 1 file, -6/+6)
| | | | | | | | | | The arm_smmu_master_data structure already represents more than just the firmware data associated to a master, and will be used extensively to represent a device's state when implementing more SMMU features. Rename the structure to arm_smmu_master. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu: Allow io-pgtable to be used outside of drivers/iommu/  (Rob Herring, 2019-02-11; 1 file, -2/+1)
| | | | | | | | | | | | | | | | | | | Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to use the page table library. Specifically, some ARM Mali GPUs use the ARM page table formats. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: iommu@lists.linux-foundation.org Cc: linux-mediatek@lists.infradead.org Cc: linux-arm-msm@vger.kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
*-.   Merge branches 'iommu/fixes', 'arm/renesas', 'arm/mediatek', 'arm/tegra', 'arm/omap', 'arm/smmu', 'x86/vt-d', 'x86/amd' and 'core' into next  (Joerg Roedel, 2018-12-20; 1 file, -26/+37)
|\ \
| | * iommu/arm-smmu: Use helper functions to access dev->iommu_fwspec  (Joerg Roedel, 2018-12-17; 1 file, -7/+9)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the new helpers dev_iommu_fwspec_get()/set() to access the dev->iommu_fwspec pointer. This makes it easier to move that pointer later into another struct. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
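      The accessor pattern this refers to, in sketch form (dev_iommu_fwspec_get()
      is the helper named above; the surrounding check is illustrative):

        #include <linux/iommu.h>

        static bool ex_has_fwspec(struct device *dev)
        {
                /* instead of dereferencing dev->iommu_fwspec directly */
                struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);

                return fwspec != NULL;
        }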
| | * iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular  (Paul Gortmaker, 2018-12-03; 1 file, -16/+9)
| |/
|/|
      The Kconfig currently controlling compilation of this code is:

        drivers/iommu/Kconfig:config ARM_SMMU_V3
        drivers/iommu/Kconfig:  bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"

      ...meaning that it currently is not being built as a module by anyone.

      Let's remove the modular code that is essentially orphaned, so that when
      reading the driver there is no doubt it is builtin-only.

      Since module_platform_driver() uses the same init level priority as
      builtin_platform_driver() the init ordering remains unchanged with this
      commit.

      We explicitly disallow a driver unbind, since that doesn't have a
      sensible use case anyway, but unlike most drivers, we can't delete the
      function tied to the ".remove" field. This is because as of commit
      7aa8619a66ae ("iommu/arm-smmu-v3: Implement shutdown method") the
      .remove function was given a one line wrapper and re-used to provide a
      .shutdown service. So we delete the wrapper and re-name the function
      from remove to shutdown.

      We add a moduleparam.h include since the file does actually declare some
      module parameters, and leaving them as such is the easiest way currently
      to remain backwards compatible with existing use cases.

      Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code. We
      also delete the MODULE_LICENSE tag etc. since all that information is
      already contained at the top of the file in the comments.

      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Nate Watterson <nwatters@codeaurora.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: iommu@lists.linux-foundation.org
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
| * iommu/arm-smmu-v3: Use explicit mb() when moving cons pointer  (Will Deacon, 2018-12-10; 1 file, -1/+7)
      After removing an entry from a queue (e.g. reading an event in
      arm_smmu_evtq_thread()) it is necessary to advance the MMIO consumer
      pointer to free the queue slot back to the SMMU.

      A memory barrier is required here so that all reads targeting the queue
      entry have completed before the consumer pointer is updated.

      The implementation of queue_inc_cons() relies on a writel() to complete
      the previous reads, but this is incorrect because writel() is only
      guaranteed to complete prior writes. This patch replaces the call to
      writel() with an mb(); writel_relaxed() sequence, which gives us the
      read->write ordering which we require.

      Cc: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
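      The ordering fix in sketch form (the queue bookkeeping and register
      handle are illustrative; the mb(); writel_relaxed() pairing is the
      point):

        #include <linux/io.h>
        #include <asm/barrier.h>

        static void queue_inc_cons_example(u32 *sw_cons, void __iomem *cons_reg)
        {
                u32 next = *sw_cons + 1;

                /* Ensure all reads of the just-consumed entry have completed
                 * before the SMMU can observe the updated consumer index; a
                 * plain writel() only orders prior *writes*. */
                mb();
                writel_relaxed(next, cons_reg);
                *sw_cons = next;
        }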
| * iommu/arm-smmu-v3: Avoid memory corruption from Hisilicon MSI payloads  (Zhen Lei, 2018-12-10; 1 file, -1/+5)
      The GITS_TRANSLATER MMIO doorbell register in the ITS hardware is
      architected to be 4 bytes in size, yet on hi1620 and earlier, Hisilicon
      have allocated the adjacent 4 bytes to carry some IMPDEF sideband
      information which results in an 8-byte MSI payload being delivered when
      signalling an interrupt:

        MSIAddr:
                 |----4bytes----|----4bytes----|
                 |    MSIData   |    IMPDEF    |

      This poses no problem for the ITS hardware because the adjacent 4 bytes
      are reserved in the memory map. However, when delivering MSIs to memory,
      as we do in the SMMUv3 driver for signalling the completion of a SYNC
      command, the extended payload will corrupt the 4 bytes adjacent to the
      "sync_count" member in struct arm_smmu_device. Fortunately, the current
      layout allocates these bytes to padding, but this is fragile and we
      should make this explicit.

      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      [will: Rewrote commit message and comment]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
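      In sketch form, "make it explicit" means reserving the extra 4 bytes
      next to the MSI target rather than relying on incidental padding; the
      structure and field names below are illustrative, not the driver's
      actual layout:

        #include <linux/types.h>

        struct ex_smmu {
                /* ... */
                union {
                        u32     sync_count;     /* what the driver polls       */
                        u64     msi_scratch;    /* room for the 8-byte payload */
                };
        };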
| * iommu/arm-smmu-v3: Fix big-endian CMD_SYNC writes  (Robin Murphy, 2018-12-10; 1 file, -1/+7)
|/ | | | | | | | | | | | | | | | | | | When we insert the sync sequence number into the CMD_SYNC.MSIData field, we do so in CPU-native byte order, before writing out the whole command as explicitly little-endian dwords. Thus on big-endian systems, the SMMU will receive and write back a byteswapped version of sync_nr, which would be perfect if it were targeting a similarly-little-endian ITS, but since it's actually writing back to memory being polled by the CPUs, they're going to end up seeing the wrong thing. Since the SMMU doesn't care what the MSIData actually contains, the minimal-overhead solution is to simply add an extra byteswap initially, such that it then writes back the big-endian format directly. Cc: <stable@vger.kernel.org> Fixes: 37de98f8f1cf ("iommu/arm-smmu-v3: Use CMD_SYNC completion MSI") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
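      A minimal sketch of the extra swap (the helper and field handling are
      illustrative):

        #include <linux/types.h>
        #include <asm/byteorder.h>

        static u32 msidata_for_cmd(u32 sync_nr)
        {
                /* A no-op on little-endian hosts; on big-endian hosts this
                 * pre-swaps the value so that the byteswap applied when the
                 * command is emitted as little-endian dwords puts the
                 * native-order value back in the memory the CPUs poll. */
                return (__force u32)cpu_to_le32(sync_nr);
        }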
* iommu/arm-smmu-v3: Remove unnecessary wrapper function  (Andrew Murray, 2018-10-10; 1 file, -8/+4)
| | | | | | | | | | | Simplify the code by removing an unnecessary wrapper function. This was left behind by commit 2f657add07a8 ("iommu/arm-smmu-v3: Specialise CMD_SYNC handling") Signed-off-by: Andrew Murray <andrew.murray@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/arm-smmu-v3: Add SPDX header  (Andrew Murray, 2018-10-10; 1 file, -12/+1)
      Replace the license boilerplate text with an SPDX header.

      Signed-off-by: Andrew Murray <andrew.murray@arm.com>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
* iommu/arm-smmu-v3: Add support for non-strict mode  (Zhen Lei, 2018-10-01; 1 file, -23/+56)
| | | | | | | | | | | | Now that io-pgtable knows how to dodge strict TLB maintenance, all that's left to do is bridge the gap between the IOMMU core requesting DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE for default domains, and showing the appropriate IO_PGTABLE_QUIRK_NON_STRICT flag to alloc_io_pgtable_ops(). Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> [rm: convert to domain attribute, tweak commit message] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Implement flush_iotlb_all hook  (Zhen Lei, 2018-10-01; 1 file, -1/+9)
      .flush_iotlb_all is currently stubbed to arm_smmu_iotlb_sync() since the
      only time it would ever need to actually do anything is for callers
      doing their own explicit batching, e.g.:

        iommu_unmap_fast(domain, ...);
        iommu_unmap_fast(domain, ...);
        iommu_iotlb_flush_all(domain, ...);

      where since io-pgtable still issues the TLBI commands implicitly in the
      unmap instead of implementing .iotlb_range_add, the "flush" only needs
      to ensure completion of those already-in-flight invalidations. However,
      we're about to start using it in anger with flush queues, so let's get a
      proper implementation wired up.

      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      [rm: document why it wasn't a bug]
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Avoid back-to-back CMD_SYNC operations  (Zhen Lei, 2018-10-01; 1 file, -3/+13)
      Putting adjacent CMD_SYNCs into the command queue is nonsensical, but
      can happen when multiple CPUs are inserting commands. Rather than leave
      the poor old hardware to chew through these operations, we can instead
      drop the subsequent SYNCs and poll for completion of the first. This has
      been shown to improve IO performance under pressure, where the number of
      SYNC operations reduces by about a third:

        CMD_SYNCs reduced:  19542181
        CMD_SYNCs total:    58098548  (include reduced)
        CMDs total:        116197099  (TLBI:SYNC about 1:1)

      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Fix unexpected CMD_SYNC timeout  (Zhen Lei, 2018-10-01; 1 file, -5/+3)
      The break condition of:

        (int)(VAL - sync_idx) >= 0

      in the __arm_smmu_sync_poll_msi() polling loop requires that sync_idx
      must be increased monotonically according to the sequence of the CMDs in
      the cmdq.

      However, since the msidata is populated using atomic_inc_return_relaxed()
      before taking the command-queue spinlock, the following scenario can
      occur:

        CPU0                    CPU1
        msidata=0
                                msidata=1
                                insert cmd1
        insert cmd0
                                smmu execute cmd1
        smmu execute cmd0
                                poll timeout, because msidata=1 is overridden
                                by cmd0, that means VAL=0, sync_idx=1.

      This is not a functional problem, since the caller will eventually
      either time out or exit due to another CMD_SYNC; however, it's clearly
      not what the code is supposed to be doing. Fix it by incrementing the
      sequence count with the command-queue lock held, allowing us to drop the
      atomic operations altogether.

      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      [will: dropped the specialised cmd building routine for now]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
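      The shape of the fix in sketch form (the queue structure and the insert
      helper below are hypothetical):

        #include <linux/spinlock.h>
        #include <linux/types.h>

        struct ex_cmdq {
                spinlock_t      lock;
                u32             sync_nr;
        };

        static void insert_cmd_sync(struct ex_cmdq *q, u32 msidata);    /* hypothetical */

        static u32 issue_cmd_sync(struct ex_cmdq *q)
        {
                unsigned long flags;
                u32 msidata;

                spin_lock_irqsave(&q->lock, flags);
                /* The sequence number now increases in queue-insertion order;
                 * previously it came from atomic_inc_return_relaxed() taken
                 * before the lock. */
                msidata = ++q->sync_nr;
                insert_cmd_sync(q, msidata);
                spin_unlock_irqrestore(&q->lock, flags);

                return msidata;
        }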
* iommu/arm-smmu-v3: Fix a couple of minor comment typos  (John Garry, 2018-10-01; 1 file, -3/+3)
| | | | | | | Fix some comment typos spotted. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* Merge tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu  (Linus Torvalds, 2018-08-24; 1 file, -8/+18)
|\
      Pull IOMMU updates from Joerg Roedel:

       - PASID table handling updates for the Intel VT-d driver. It implements
         a global PASID space now so that applications using multiple devices
         will just have one PASID.

       - A new config option to make iommu passthrough mode the default.

       - New sysfs attribute for iommu groups to export the type of the
         default domain.

       - A debugfs interface (for debug only) usable by IOMMU drivers to
         export internals to user-space.

       - R-Car Gen3 SoCs support for the ipmmu-vmsa driver.

       - The ARM-SMMU now aborts transactions from unknown devices and devices
         not attached to any domain.

       - Various cleanups and smaller fixes all over the place.

      * tag 'iommu-updates-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (42 commits)
        iommu/omap: Fix cache flushes on L2 table entries
        iommu: Remove the ->map_sg indirection
        iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel
        iommu/arm-smmu-v3: Prevent any devices access to memory without registration
        iommu/ipmmu-vmsa: Don't register as BUS IOMMU if machine doesn't have IPMMU-VMSA
        iommu/ipmmu-vmsa: Clarify supported platforms
        iommu/ipmmu-vmsa: Fix allocation in atomic context
        iommu: Add config option to set passthrough as default
        iommu: Add sysfs attribyte for domain type
        iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register
        iommu/arm-smmu: Error out only if not enough context interrupts
        iommu/io-pgtable-arm-v7s: Abort allocation when table address overflows the PTE
        iommu/io-pgtable-arm: Fix pgtable allocation in selftest
        iommu/vt-d: Remove the obsolete per iommu pasid tables
        iommu/vt-d: Apply per pci device pasid table in SVA
        iommu/vt-d: Allocate and free pasid table
        iommu/vt-d: Per PCI device pasid table interfaces
        iommu/vt-d: Add for_each_device_domain() helper
        iommu/vt-d: Move device_domain_info to header
        iommu/vt-d: Apply global PASID in SVA
        ...
| *-.   Merge branches 'arm/shmobile', 'arm/renesas', 'arm/msm', 'arm/smmu', 'arm/omap', 'x86/amd', 'x86/vt-d' and 'core' into next  (Joerg Roedel, 2018-08-08; 1 file, -8/+18)
| |\ \
| | | * iommu: Remove the ->map_sg indirection  (Christoph Hellwig, 2018-08-08; 1 file, -1/+0)
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | All iommu drivers use the default_iommu_map_sg implementation, and there is no good reason to ever override it. Just expose it as iommu_map_sg directly and remove the indirection, specially in our post-spectre world where indirect calls are horribly expensive. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| | * iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel  (Will Deacon, 2018-07-27; 1 file, -6/+16)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we find that the SMMU is enabled during probe, we reset it by re-initialising its registers and either enabling translation or placing it into bypass based on the disable_bypass commandline option. In the case of a kdump kernel, the SMMU won't have been shutdown cleanly by the previous kernel and there may be concurrent DMA through the SMMU. Rather than reset the SMMU to bypass, which would likely lead to rampant data corruption, we can instead configure the SMMU to abort all incoming transactions when we find that it is enabled from within a kdump kernel. Reported-by: Sameer Goel <sgoel@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
| | * iommu/arm-smmu-v3: Prevent any devices access to memory without registration  (Zhen Lei, 2018-07-27; 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stream bypass is a potential security hole since a malicious device can be hotplugged in without matching any drivers, yet be granted the ability to access all of physical memory. Now that we attach devices to domains by default, we can toggle the disable_bypass default to "on", preventing DMA from unknown devices. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
| | * iommu/arm-smmu-v3: sync the OVACKFLG to PRIQ consumer register  (Miao Zhong, 2018-07-26; 1 file, -0/+1)
| |/ | | | | | | | | | | | | | | | | | | | | When PRI queue occurs overflow, driver should update the OVACKFLG to the PRIQ consumer register, otherwise subsequent PRI requests will not be processed. Cc: Will Deacon <will.deacon@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Miao Zhong <zhongmiao@hisilicon.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* / iommu: Remove IOMMU_OF_DECLARE  (Rob Herring, 2018-07-10; 1 file, -2/+0)
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we use the driver core to stop deferred probe for missing drivers, IOMMU_OF_DECLARE can be removed. This is slightly less optimal than having a list of built-in drivers in that we'll now defer probe twice before giving up. This shouldn't have a significant impact on boot times as past discussions about deferred probe have given no evidence of deferred probe having a substantial impact. Cc: Robin Murphy <robin.murphy@arm.com> Cc: Kukjin Kim <kgene@kernel.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Rob Clark <robdclark@gmail.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Frank Rowand <frowand.list@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Cc: iommu@lists.linux-foundation.org Cc: linux-samsung-soc@vger.kernel.org Cc: linux-arm-msm@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Cc: devicetree@vger.kernel.org Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* iommu/arm-smmu-v3: Support 52-bit virtual address  (Robin Murphy, 2018-03-27; 1 file, -1/+9)
| | | | | | | | | Stage 1 input addresses are effectively 64-bit in SMMUv3 anyway, so really all that's involved is letting io-pgtable know the appropriate upper bound for T0SZ. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Support 52-bit physical address  (Robin Murphy, 2018-03-27; 1 file, -15/+20)
| | | | | | | | | | Implement SMMUv3.1 support for 52-bit physical addresses. Since a 52-bit OAS implies 64KB translation granule support, permitting level 1 block entries there is simple, and the rest is just extending address fields. Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Clean up queue definitions  (Robin Murphy, 2018-03-27; 1 file, -73/+55)
| | | | | | | | | | | | | As with registers and tables, use GENMASK and the bitfield accessors consistently for queue fields, to save some lines and ease maintenance a little. This now leaves everything in a nice state where all named field definitions expect to be used with bitfield accessors (although since single-bit fields can still be used directly we leave some of those uses as-is to avoid unnecessary churn), while the few remaining *_MASK definitions apply exclusively to in-place values. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Clean up table definitions  (Robin Murphy, 2018-03-27; 1 file, -90/+59)
| | | | | | | | | As with registers, use GENMASK and the bitfield accessors consistently for table fields, to save some lines and ease maintenance a little. This also catches a subtle off-by-one wherein bit 5 of CD.T0SZ was missing. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Clean up register definitions  (Robin Murphy, 2018-03-27; 1 file, -97/+77)
| | | | | | | | | | | | | | | The FIELD_{GET,PREP} accessors provided by linux/bitfield.h allow us to define multi-bit register fields solely in terms of their bit positions via GENMASK(), without needing explicit *_SHIFT and *_MASK definitions. As well as the immediate reduction in lines of code, this avoids the awkwardness of values sometimes being pre-shifted and sometimes not, which means we can factor out some common values like memory attributes. Furthermore, it also makes it trivial to verify the definitions against the architecture spec, on which note let's also fix up a few field names to properly match the current release (IHI0070B). Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
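      The style being converged on, illustrated with a made-up field (the
      field name and bit positions below are examples, not taken from the
      SMMUv3 spec):

        #include <linux/bitfield.h>
        #include <linux/bits.h>
        #include <linux/types.h>

        #define EX_REG_MEMATTR  GENMASK_ULL(7, 4)       /* example field */

        static u64 ex_set_memattr(u64 reg, u64 attr)
        {
                reg &= ~EX_REG_MEMATTR;
                return reg | FIELD_PREP(EX_REG_MEMATTR, attr);  /* no manual shifting */
        }

        static u64 ex_get_memattr(u64 reg)
        {
                return FIELD_GET(EX_REG_MEMATTR, reg);
        }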
* iommu/arm-smmu-v3: Clean up address masking  (Robin Murphy, 2018-03-27; 1 file, -32/+21)
| | | | | | | | | | | | Before trying to add the SMMUv3.1 support for 52-bit addresses, make things bearable by cleaning up the various address mask definitions to use GENMASK_ULL() consistently. The fact that doing so reveals (and fixes) a latent off-by-one in Q_BASE_ADDR_MASK only goes to show what a jolly good idea it is... Tested-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: limit reporting of MSI allocation failures  (Nate Watterson, 2018-03-27; 1 file, -1/+6)
| | | | | | | | | | | | | | | | Currently, the arm-smmu-v3 driver expects to allocate MSIs for all SMMUs with FEAT_MSI set. This results in unwarranted "failed to allocate MSIs" warnings being printed on systems where FW was either deliberately configured to force the use of SMMU wired interrupts -or- is altogether incapable of describing SMMU MSI topology (ACPI IORT prior to rev.C). Remedy this by checking msi_domain before attempting to allocate SMMU MSIs. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Nate Watterson <nwatters@codeaurora.org> Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
* iommu/arm-smmu-v3: Warn about missing IRQs  (Robin Murphy, 2018-03-27; 1 file, -0/+6)
| | | | | | | | | | It is annoyingly non-obvious when DMA transactions silently go missing due to undetected SMMU faults. Help skip the first few debugging steps in those situations by making it clear when we have neither wired IRQs nor MSIs with which to raise error conditions. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
*-.   Merge branches 'arm/renesas', 'arm/omap', 'arm/exynos', 'x86/amd', 'x86/vt-d' and 'core' into next  (Joerg Roedel, 2018-01-17; 1 file, -5/+14)
|\ \
| | * iommu: Clean up of_iommu_init_fn  (Robin Murphy, 2018-01-17; 1 file, -1/+1)
| |/ |/| | | | | | | | | | | | | | | | | | | | | Now that no more drivers rely on arbitrary early initialisation via an of_iommu_init_fn hook, let's clean up the redundant remnants. The IOMMU_OF_DECLARE() macro needs to remain for now, as the probe-deferral mechanism has no other nice way to detect built-in drivers before they have registered themselves, such that it can make the right decision. Reviewed-by: Sricharan R <sricharan@codeaurora.org> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
| * iommu/arm-smmu-v3: Cope with duplicated Stream IDs  (Robin Murphy, 2018-01-02; 1 file, -1/+8)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For PCI devices behind an aliasing PCIe-to-PCI/X bridge, the bridge alias to DevFn 0.0 on the subordinate bus may match the original RID of the device, resulting in the same SID being present in the device's fwspec twice. This causes trouble later in arm_smmu_write_strtab_ent() when we wind up visiting the STE a second time and find it already live. Avoid the issue by giving arm_smmu_install_ste_for_dev() the cleverness to skip over duplicates. It seems mildly counterintuitive compared to preventing the duplicates from existing in the first place, but since the DT and ACPI probe paths build their fwspecs differently, this is actually the cleanest and most self-contained way to deal with it. Cc: <stable@vger.kernel.org> Fixes: 8f78515425da ("iommu/arm-smmu: Implement of_xlate() for SMMUv3") Reported-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Tested-by: Tomasz Nowicki <Tomasz.Nowicki@cavium.com> Tested-by: Jayachandran C. <jnair@caviumnetworks.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
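      A sketch of the de-duplication (the fwspec walk mirrors the description
      above; the STE-writing helper is hypothetical):

        #include <linux/iommu.h>

        static void write_strtab_ent_for(u32 sid);      /* hypothetical */

        static void install_ste_for_dev_example(struct iommu_fwspec *fwspec)
        {
                unsigned int i, j;

                for (i = 0; i < fwspec->num_ids; i++) {
                        u32 sid = fwspec->ids[i];

                        /* A bridge alias can repeat a SID earlier in the
                         * fwspec; write each STE only once. */
                        for (j = 0; j < i; j++)
                                if (fwspec->ids[j] == sid)
                                        break;
                        if (j < i)
                                continue;

                        write_strtab_ent_for(sid);
                }
        }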
| * iommu/arm-smmu-v3: Don't free page table ops twice  (Jean-Philippe Brucker, 2018-01-02; 1 file, -3/+5)
|/ | | | | | | | | | | | Kasan reports a double free when finalise_stage_fn fails: the io_pgtable ops are freed by arm_smmu_domain_finalise and then again by arm_smmu_domain_free. Prevent this by leaving pgtbl_ops empty on failure. Cc: <stable@vger.kernel.org> Fixes: 48ec83bcbcf5 ("iommu/arm-smmu: Add initial driver support for ARM SMMUv3 devices") Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
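      The idea in sketch form (the domain structure and stage-finalise helper
      are simplified stand-ins; the io-pgtable header path reflects its later
      move noted further down this log):

        #include <linux/errno.h>
        #include <linux/io-pgtable.h>

        struct ex_domain {
                struct io_pgtable_ops   *pgtbl_ops;
        };

        static int finalise_stage(struct ex_domain *dom,
                                  struct io_pgtable_cfg *cfg);  /* hypothetical */

        static int ex_domain_finalise(struct ex_domain *dom,
                                      enum io_pgtable_fmt fmt,
                                      struct io_pgtable_cfg *cfg)
        {
                struct io_pgtable_ops *ops;
                int ret;

                ops = alloc_io_pgtable_ops(fmt, cfg, dom);
                if (!ops)
                        return -ENOMEM;

                ret = finalise_stage(dom, cfg);
                if (ret) {
                        /* Freed here and only here; dom->pgtbl_ops stays NULL,
                         * so the domain-free path can't free it a second time. */
                        free_io_pgtable_ops(ops);
                        return ret;
                }

                dom->pgtbl_ops = ops;   /* published only on success */
                return 0;
        }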
*   Merge branches 'iommu/arm/smmu', 'iommu/updates', 'iommu/vt-d', 'iommu/ipmmu-vmsa' and 'iommu/iova' into iommu-next-20171113.0  (Alex Williamson, 2017-11-13; 1 file, -0/+10)
|\
| * iommu/io-pgtable-arm: Convert to IOMMU API TLB sync  (Robin Murphy, 2017-10-02; 1 file, -0/+10)
| | | | | | | | | | | | | | | | | | | | | | | | Now that the core API issues its own post-unmap TLB sync call, push that operation out from the io-pgtable-arm internals into the users. For now, we leave the invalidation implicit in the unmap operation, since none of the current users would benefit much from any change to that. CC: Magnus Damm <damm+renesas@opensource.se> CC: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
* | iommu/arm-smmu-v3: Use burst-polling for sync completion  (Robin Murphy, 2017-10-20; 1 file, -2/+6)
| | | | | | | | | | | | | | | | | | | | | | | | While CMD_SYNC is unlikely to complete immediately such that we never go round the polling loop, with a lightly-loaded queue it may still do so long before the delay period is up. If we have no better completion notifier, use similar logic as we have for SMMUv2 to spin a number of times before each backoff, so that we have more chance of catching syncs which complete relatively quickly and avoid delaying unnecessarily. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
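      The polling strategy in sketch form (the spin count, backoff bounds and
      completion callback are all illustrative):

        #include <linux/delay.h>
        #include <linux/errno.h>
        #include <linux/kernel.h>
        #include <asm/processor.h>

        static int poll_sync_example(bool (*sync_done)(void *), void *cookie,
                                     unsigned int attempts)
        {
                unsigned int delay_us = 1, spin;

                while (attempts--) {
                        /* Spin a short burst first to catch syncs which
                         * complete quickly... */
                        for (spin = 0; spin < 10; spin++) {
                                if (sync_done(cookie))
                                        return 0;
                                cpu_relax();
                        }
                        /* ...then back off before the next burst. */
                        udelay(delay_us);
                        delay_us = min(delay_us * 2, 16U);
                }

                return -ETIMEDOUT;
        }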