summaryrefslogtreecommitdiffstats
path: root/hw
Commit message (Collapse)AuthorAgeFilesLines
* virtio-iommu: Add notify_flag_changed() memory region callbackBharat Bhushan2020-11-032-0/+16
| | | | | | | | | | | | Add notify_flag_changed() to notice when memory listeners are added and removed. Acked-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-7-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-iommu: Add replay() memory region callbackBharat Bhushan2020-11-032-0/+41
| | | | | | | | | | | Implement the replay callback to setup all mappings for a new memory region. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-6-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-iommu: Call memory notifiers in attach/detachBharat Bhushan2020-11-031-0/+32
| | | | | | | | | | | | Call the memory notifiers when attaching an endpoint to a domain, to replay existing mappings, and when detaching the endpoint, to remove all mappings. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-5-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-iommu: Add memory notifiers for map/unmapBharat Bhushan2020-11-032-0/+58
| | | | | | | | | | | | Extend VIRTIO_IOMMU_T_MAP/UNMAP request to notify memory listeners. It will call VFIO notifier to map/unmap regions in the physical IOMMU. Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-4-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-iommu: Store memory region in endpoint structJean-Philippe Brucker2020-11-031-1/+10
| | | | | | | | | | | Store the memory region associated to each endpoint into the endpoint structure, to allow efficient memory notification on map/unmap. Acked-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-3-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-iommu: Fix virtio_iommu_mr()Jean-Philippe Brucker2020-11-031-1/+1
| | | | | | | | | | | | | | | Due to an invalid mask, virtio_iommu_mr() may return the wrong memory region. It hasn't been too problematic so far because the function was only used to test existence of an endpoint, but that is about to change. Fixes: cfb42188b24d ("virtio-iommu: Implement attach/detach command") Cc: QEMU Stable <qemu-stable@nongnu.org> Acked-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Message-Id: <20201030180510.747225-2-jean-philippe@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/smbios: Fix leaked fd in save_opt_one() error pathPhilippe Mathieu-Daudé2020-11-031-1/+3
| | | | | | | | | | | | | | | | | | | Fix the following Coverity issue (RESOURCE_LEAK): CID 1432879: Resource leak Handle variable fd going out of scope leaks the handle. Replace a close() call by qemu_close() since the handle is opened with qemu_open(). Fixes: bb99f4772f5 ("hw/smbios: support loading OEM strings values from a file") Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201030152742.1553968-1-philmd@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/virtio/vhost-backend: Fix Coverity CID 1432871Philippe Mathieu-Daudé2020-11-031-2/+2
| | | | | | | | | | | | | | | | | | | Fix uninitialized value issues reported by Coverity: Field 'msg.reserved' is uninitialized when calling write(). While the 'struct vhost_msg' does not have a 'reserved' field, we still initialize it to have the two parts of the function consistent. Reported-by: Coverity (CID 1432864: UNINIT) Fixes: c471ad0e9bd ("vhost_net: device IOTLB support") Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201103063541.2463363-1-philmd@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/acpi : add spaces around operatorXinhao Zhang2020-11-031-1/+1
| | | | | | | | | | Fix code style. Operator needs spaces both sides. Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com> Signed-off-by: Kai Deng <dengkai1@huawei.com> Message-Id: <20201103102634.273021-3-zhangxinhao1@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/acpi : add space before the open parenthesis '('Xinhao Zhang2020-11-031-1/+1
| | | | | | | | | | Fix code style. Space required before the open parenthesis '('. Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com> Signed-off-by: Kai Deng <dengkai1@huawei.com> Message-Id: <20201103102634.273021-2-zhangxinhao1@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* hw/acpi : Don't use '#' flag of printf formatXinhao Zhang2020-11-031-10/+10
| | | | | | | | | | | Fix code style. Don't use '#' flag of printf format ('%#') in format strings, use '0x' prefix instead Signed-off-by: Xinhao Zhang <zhangxinhao1@huawei.com> Signed-off-by: Kai Deng <dengkai1@huawei.com> Message-Id: <20201103102634.273021-1-zhangxinhao1@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virito-mem: Implement get_min_alignment()David Hildenbrand2020-11-031-0/+7
| | | | | | | | | | | | | | | | | | | | | The block size determines the alignment requirements. Implement get_min_alignment() of the TYPE_MEMORY_DEVICE interface. This allows auto-assignment of a properly aligned address in guest physical address space. For example, when specifying a 2GB block size for a virtio-mem device with 10GB with a memory setup "-m 4G, 20G", we'll no longer fail when realizing. Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20201008083029.9504-7-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* memory-device: Add get_min_alignment() callbackDavid Hildenbrand2020-11-031-2/+9
| | | | | | | | | | | | | | | | | | | | | Add a callback that can be used to express additional alignment requirements (exceeding the ones from the memory region). Will be used by virtio-mem to express special alignment requirements due to manually configured, big block sizes (e.g., 1GB with an ordinary memory-backend-ram). This avoids failing later when realizing, because auto-detection wasn't able to assign a properly aligned address. Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20201008083029.9504-6-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* memory-device: Support big alignment requirementsDavid Hildenbrand2020-11-031-4/+5
| | | | | | | | | | | | | | | | | | | | | | Let's warn instead of bailing out - the worst thing that can happen is that we'll fail hot/coldplug later. The user got warned, and this should be rare. This will be necessary for memory devices with rather big (user-defined) alignment requirements - say a virtio-mem device with a 2G block size - which will become important, for example, when supporting vfio in the future. Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20201008083029.9504-5-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Probe THP size to determine default block sizeDavid Hildenbrand2020-11-031-4/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's allow a minimum block size of 1 MiB in all configurations. Select the default block size based on - The page size of the memory backend. - The THP size if the memory backend size corresponds to the real host page size. - The global minimum of 1 MiB. and warn if something smaller is configured by the user. VIRTIO_MEM only supports Linux (depends on LINUX), so we can probe the THP size unconditionally. For now we only support virtio-mem on x86-64 - there isn't a user-visible change (x86-64 only supports 2 MiB THP on the PMD level) - the default was, and will be 2 MiB. If we ever have THP on the PUD level (e.g., 1 GiB THP on x86-64), we expect it to be more transparent - e.g., to only optimize fully populated ranges unless explicitly told /configured otherwise (in contrast to PMD THP). Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20201008083029.9504-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Make sure "usable_region_size" is always multiples of the block sizeDavid Hildenbrand2020-11-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | The spec states: "The device MUST set addr, region_size, usable_region_size, plugged_size, requested_size to multiples of block_size." With block sizes > 256MB, we currently wouldn't guarantee that for the usable_region_size. Note that we cannot exceed the region_size, as we already enforce the alignment there properly. Fixes: 910b25766b33 ("virtio-mem: Paravirtualized memory hot(un)plug") Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20201008083029.9504-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* virtio-mem: Make sure "addr" is always multiples of the block sizeDavid Hildenbrand2020-11-031-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The spec states: "The device MUST set addr, region_size, usable_region_size, plugged_size, requested_size to multiples of block_size." In some cases, we currently don't guarantee that for "addr": For example, when starting a VM with 4 GiB boot memory and a virtio-mem device with a block size of 2 GiB, "memaddr"/"addr" will be auto-assigned to 0x140000000 (5 GiB). We'll try to improve auto-assignment for memory devices next, to avoid bailing out in case memory device code selects a bad address. Note: The Linux driver doesn't support such big block sizes yet. Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Fixes: 910b25766b33 ("virtio-mem: Paravirtualized memory hot(un)plug") Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20201008083029.9504-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* pc: comment style fixupMichael S. Tsirkin2020-11-031-4/+5
| | | | | | | Fix up checkpatch comment style warnings. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Chen Qun <kuhn.chenqun@huawei.com>
* Merge remote-tracking branch ↵Peter Maydell2020-11-035-8/+17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/pmaydell/tags/pull-target-arm-20201102' into staging target-arm queue: * target/arm: Fix Neon emulation bugs on big-endian hosts * target/arm: fix handling of HCR.FB * target/arm: fix LORID_EL1 access check * disas/capstone: Fix monitor disassembly of >32 bytes * hw/arm/smmuv3: Fix potential integer overflow (CID 1432363) * hw/arm/boot: fix SVE for EL3 direct kernel boot * hw/display/omap_lcdc: Fix potential NULL pointer dereference * hw/display/exynos4210_fimd: Fix potential NULL pointer dereference * target/arm: Get correct MMU index for other-security-state * configure: Test that gio libs from pkg-config work * hw/intc/arm_gicv3_cpuif: Make GIC maintenance interrupts work * docs: Fix building with Sphinx 3 * tests/qtest/npcm7xx_rng-test: Disable randomness tests # gpg: Signature made Mon 02 Nov 2020 17:09:00 GMT # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20201102: (26 commits) tests/qtest/npcm7xx_rng-test: Disable randomness tests qemu-option-trace.rst.inc: Don't use option:: markup scripts/kerneldoc: For Sphinx 3 use c:macro for macros with arguments hw/intc/arm_gicv3_cpuif: Make GIC maintenance interrupts work configure: Test that gio libs from pkg-config work target/arm: Get correct MMU index for other-security-state hw/display/exynos4210_fimd: Fix potential NULL pointer dereference hw/display/omap_lcdc: Fix potential NULL pointer dereference hw/arm/boot: fix SVE for EL3 direct kernel boot hw/arm/smmuv3: Fix potential integer overflow (CID 1432363) disas/capstone: Fix monitor disassembly of >32 bytes target/arm: fix LORID_EL1 access check target/arm: fix handling of HCR.FB target/arm: Fix VUDOT/VSDOT (scalar) on big-endian hosts target/arm: Fix float16 pairwise Neon ops on big-endian hosts target/arm: Improve do_prewiden_3d target/arm: Simplify do_long_3d and do_2scalar_long target/arm: Rename neon_load_reg64 to vfp_load_reg64 target/arm: Add read/write_neon_element64 target/arm: Rename neon_load_reg32 to vfp_load_reg32 ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * hw/intc/arm_gicv3_cpuif: Make GIC maintenance interrupts workPeter Maydell2020-11-021-3/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In gicv3_init_cpuif() we copy the ARMCPU gicv3_maintenance_interrupt into the GICv3CPUState struct's maintenance_irq field. This will only work if the board happens to have already wired up the CPU maintenance IRQ before the GIC was realized. Unfortunately this is not the case for the 'virt' board, and so the value that gets copied is NULL (since a qemu_irq is really a pointer to an IRQState struct under the hood). The effect is that the CPU interface code never actually raises the maintenance interrupt line. Instead, since the GICv3CPUState has a pointer to the CPUState, make the dereference at the point where we want to raise the interrupt, to avoid an implicit requirement on board code to wire things up in a particular order. Reported-by: Jose Martins <josemartins90@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20201009153904.28529-1-peter.maydell@linaro.org Reviewed-by: Luc Michel <luc@lmichel.fr>
| * hw/display/exynos4210_fimd: Fix potential NULL pointer dereferenceAlexChen2020-11-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | In exynos4210_fimd_update(), the pointer s is dereferinced before being check if it is valid, which may lead to NULL pointer dereference. So move the assignment to global_width after checking that the s is valid. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Alex Chen <alex.chen@huawei.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 5F9F8D88.9030102@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * hw/display/omap_lcdc: Fix potential NULL pointer dereferenceAlexChen2020-11-021-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | In omap_lcd_interrupts(), the pointer omap_lcd is dereferinced before being check if it is valid, which may lead to NULL pointer dereference. So move the assignment to surface after checking that the omap_lcd is valid and move surface_bits_per_pixel(surface) to after the surface assignment. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: AlexChen <alex.chen@huawei.com> Message-id: 5F9CDB8A.9000001@huawei.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * hw/arm/boot: fix SVE for EL3 direct kernel bootRémi Denis-Courmont2020-11-021-0/+3
| | | | | | | | | | | | | | | | | | | | When booting a CPU with EL3 using the -kernel flag, set up CPTR_EL3 so that SVE will not trap to EL3. Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030151541.11976-1-remi@remlab.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * hw/arm/smmuv3: Fix potential integer overflow (CID 1432363)Philippe Mathieu-Daudé2020-11-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the BIT_ULL() macro to ensure we use 64-bit arithmetic. This fixes the following Coverity issue (OVERFLOW_BEFORE_WIDEN): CID 1432363 (#1 of 1): Unintentional integer overflow: overflow_before_widen: Potentially overflowing expression 1 << scale with type int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type hwaddr (64 bits, unsigned). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Eric Auger <eric.auger@redhat.com> Message-id: 20201030144617.1535064-1-philmd@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20201102' into stagingPeter Maydell2020-11-027-295/+980
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nvme pull 2 Nov 2020 # gpg: Signature made Mon 02 Nov 2020 15:20:30 GMT # gpg: using RSA key DBC11D2D373B4A3755F502EC625156610A4F6CC0 # gpg: Good signature from "Keith Busch <kbusch@kernel.org>" [unknown] # gpg: aka "Keith Busch <keith.busch@gmail.com>" [unknown] # gpg: aka "Keith Busch <keith.busch@intel.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: DBC1 1D2D 373B 4A37 55F5 02EC 6251 5661 0A4F 6CC0 * remotes/nvme/tags/pull-nvme-20201102: (30 commits) hw/block/nvme: fix queue identifer validation hw/block/nvme: fix create IO SQ/CQ status codes hw/block/nvme: fix prp mapping status codes hw/block/nvme: report actual LBA data shift in LBAF hw/block/nvme: add trace event for requests with non-zero status code hw/block/nvme: add nsid to get/setfeat trace events hw/block/nvme: reject io commands if only admin command set selected hw/block/nvme: support for admin-only command set hw/block/nvme: validate command set selected hw/block/nvme: support per-namespace smart log hw/block/nvme: fix log page offset check hw/block/nvme: remove pointless rw indirection hw/block/nvme: update nsid when registered hw/block/nvme: change controller pci id pci: allocate pci id for nvme hw/block/nvme: support multiple namespaces hw/block/nvme: refactor identify active namespace id list hw/block/nvme: add support for sgl bit bucket descriptor hw/block/nvme: add support for scatter gather lists hw/block/nvme: harden cmb access ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | hw/block/nvme: fix queue identifer validationGollu Appalanaidu2020-10-271-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nvme_check_{sq,cq} functions check if the given queue identifer is valid *and* that the queue exists. Thus, the function return value cannot simply be inverted to check if the identifer is valid and that the queue does *not* exist. Replace the call with an OR'ed version of the checks. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: fix create IO SQ/CQ status codesGollu Appalanaidu2020-10-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the Invalid Field in Command with the Invalid PRP Offset status code in the nvme_create_{cq,sq} functions. Also, allow PRP1 to be address 0x0. Also replace the Completion Queue Invalid status code returned in nvme_create_cq when the the queue identifier is invalid with the Invalid Queue Identifier. The Completion Queue Invalid status code is exclusively for indicating that the completion queue identifer given when creating a submission queue is invalid. See NVM Express v1.3d, Section 5.3 ("Create I/O Completion Queue command") and 5.4("Create I/O Submission Queue command"). Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: fix prp mapping status codesGollu Appalanaidu2020-10-272-18/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address 0 is not an invalid address. Remove those invalikd checks. Unaligned PRP2 and PRP list entries should result in Invalid PRP Offset status code and not Invalid Field. Fix that. See NVMe Express v1.3d, Section 4.3 ("Physical Region Page Entry and List"). Suggested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: report actual LBA data shift in LBAFDmitry Fomichev2020-10-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculate the data shift value to report based on the set value of logical_block_size device property. In the process, use a local variable to calculate the LBA format index instead of the hardcoded value 0. This makes the code more readable and it will make it easier to add support for multiple LBA formats in the future. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | hw/block/nvme: add trace event for requests with non-zero status codeKlaus Jensen2020-10-272-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | If a command results in a non-zero status code, trace it. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: add nsid to get/setfeat trace eventsKlaus Jensen2020-10-272-4/+4
| | | | | | | | | | | | | | | | | | | | | Include the namespace id in the pci_nvme_{get,set}feat trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: reject io commands if only admin command set selectedKlaus Jensen2020-10-271-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the host sets CC.CSS to 111b, all commands submitted to I/O queues should be completed with status Invalid Command Opcode. Note that this is technically a v1.4 feature, but it does not hurt to implement before we finally bump the reported version implemented. Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: support for admin-only command setKeith Busch2020-10-271-0/+1
| | | | | | | | | | | | | | | Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | hw/block/nvme: validate command set selectedKeith Busch2020-10-272-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | Fail to start the controller if the user requests a command set that the controller does not support. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | hw/block/nvme: support per-namespace smart logKeith Busch2020-10-271-24/+39
| | | | | | | | | | | | | | | | | | | | | | | | Let the user specify a specific namespace if they want to get access stats for a specific namespace. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | hw/block/nvme: fix log page offset checkKeith Busch2020-10-271-12/+10Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return error if the requested offset starts after the size of the log being returned. Also, move the check for earlier in the function so we're not doing unnecessary calculations. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed- by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | hw/block/nvme: remove pointless rw indirectionKeith Busch2020-10-272-63/+29Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code switches on the opcode to invoke a function specific to that opcode. There's no point in consolidating back to a common function that just switches on that same opcode without any actual common code. Restore the opcode specific behavior without going back through another level of switches. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | hw/block/nvme: update nsid when registeredKlaus Jensen2020-10-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the user does not specify an nsid parameter on the nvme-ns device, nvme_register_namespace will find the first free namespace id and assign that. This fix makes sure the assigned id is saved. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
| * | hw/block/nvme: change controller pci idKlaus Jensen2020-10-273-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two reasons for changing this: 1. The nvme device currently uses an internal Intel device id. 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: support multiple namespaces" the controller device no longer has the quirks that the Linux kernel think it has. As the quirks are applied based on pci vendor and device id, change them to get rid of the quirks. To keep backward compatibility, add a new 'use-intel-id' parameter to the nvme device to force use of the Intel vendor and device id. This is off by default but add a compat property to set this for 5.1 machines and older. If a 5.1 machine is booted (or the use-intel-id parameter is explicitly set to true), the Linux kernel will just apply these unnecessary quirks: 1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support anything else than values 0x0 and 0x1 for CNS (Identify Namespace and Identify Namespace). With multiple namespace support, this just means that the kernel will "scan" namespaces instead of using "Active Namespace ID list" (CNS 0x2). 2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a broken Write Zeroes implementation which has since been fixed in commit 9d6459d21a6e ("nvme: fix write zeroes offset and count"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
| * | hw/block/nvme: support multiple namespacesKlaus Jensen2020-10-276-114/+426
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for multiple namespaces by introducing a new 'nvme-ns' device model. The nvme device creates a bus named from the device name ('id'). The nvme-ns devices then connect to this and registers themselves with the nvme device. This changes how an nvme device is created. Example with two namespaces: -drive file=nvme0n1.img,if=none,id=disk1 -drive file=nvme0n2.img,if=none,id=disk2 -device nvme,serial=deadbeef,id=nvme0 -device nvme-ns,drive=disk1,bus=nvme0,nsid=1 -device nvme-ns,drive=disk2,bus=nvme0,nsid=2 The drive property is kept on the nvme device to keep the change backward compatible, but the property is now optional. Specifying a drive for the nvme device will always create the namespace with nsid 1. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
| * | hw/block/nvme: refactor identify active namespace id listKlaus Jensen2020-10-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Prepare to support inactive namespaces. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: add support for sgl bit bucket descriptorGollu Appalanaidu2020-10-271-6/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for SGL descriptor type 0x1 (bit bucket descriptor). See the NVM Express v1.3d specification, Section 4.4 ("Scatter Gather List (SGL)"). Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: add support for scatter gather listsKlaus Jensen2020-10-272-57/+276
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For now, support the Data Block, Segment and Last Segment descriptor types. See NVM Express 1.3d, Section 4.4 ("Scatter Gather List (SGL)"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: harden cmb accessKlaus Jensen2020-10-271-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the controller has only supported PRPs so far it has not been required to check the ending address (addr + len - 1) of the CMB access for validity since it has been guaranteed to be in range of the CMB. This changes when the controller adds support for SGLs (next patch), so add that check. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: default request status to successKlaus Jensen2020-10-271-3/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | Make the default request status NVME_SUCCESS so only error status codes have to be set. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: refactor aio submissionKlaus Jensen2020-10-273-39/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This pulls block layer aio submission/completion to common functions. For completions, additionally map an AIO error to the Unrecovered Read and Write Fault status codes. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: add symbolic command name to trace eventsKlaus Jensen2020-10-273-6/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add the symbolic command name to the pci_nvme_{io,admin}_cmd and pci_nvme_rw trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: fix endian conversionKlaus Jensen2020-10-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The raw NLB field is a 16 bit value, so use le16_to_cpu instead of le32_to_cpu and cast to uint32_t before incrementing the value to not wrap around. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
| * | hw/block/nvme: add a lba to bytes helperKlaus Jensen2020-10-272-8/+10
| | | | | | | | | | | | | | | | | | | | | | | | Add the nvme_l2b helper and use it for converting NLB and SLBA to byte counts and offsets. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | hw/block/nvme: alignment style fixesKlaus Jensen2020-10-271-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | Style fixes. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>