summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * | hw/display/omap_lcdc: Fix potential NULL pointer dereferenceAlexChen2020-11-021-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In omap_lcd_interrupts(), the pointer omap_lcd is dereferinced before being check if it is valid, which may lead to NULL pointer dereference. So move the assignment to surface after checking that the omap_lcd is valid and move surface_bits_per_pixel(surface) to after the surface assignment. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: AlexChen <alex.chen@huawei.com> Message-id: 5F9CDB8A.9000001@huawei.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | hw/arm/boot: fix SVE for EL3 direct kernel bootRémi Denis-Courmont2020-11-021-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When booting a CPU with EL3 using the -kernel flag, set up CPTR_EL3 so that SVE will not trap to EL3. Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030151541.11976-1-remi@remlab.net Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | hw/arm/smmuv3: Fix potential integer overflow (CID 1432363)Philippe Mathieu-Daudé2020-11-021-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the BIT_ULL() macro to ensure we use 64-bit arithmetic. This fixes the following Coverity issue (OVERFLOW_BEFORE_WIDEN): CID 1432363 (#1 of 1): Unintentional integer overflow: overflow_before_widen: Potentially overflowing expression 1 << scale with type int (32 bits, signed) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type hwaddr (64 bits, unsigned). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Eric Auger <eric.auger@redhat.com> Message-id: 20201030144617.1535064-1-philmd@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | disas/capstone: Fix monitor disassembly of >32 bytesPeter Maydell2020-11-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're using the capstone disassembler, disassembly of a run of instructions more than 32 bytes long disassembles the wrong data for instructions beyond the 32 byte mark: (qemu) xp /16x 0x100 0000000000000100: 0x00000005 0x54410001 0x00000001 0x00001000 0000000000000110: 0x00000000 0x00000004 0x54410002 0x3c000000 0000000000000120: 0x00000000 0x00000004 0x54410009 0x74736574 0000000000000130: 0x00000000 0x00000000 0x00000000 0x00000000 (qemu) xp /16i 0x100 0x00000100: 00000005 andeq r0, r0, r5 0x00000104: 54410001 strbpl r0, [r1], #-1 0x00000108: 00000001 andeq r0, r0, r1 0x0000010c: 00001000 andeq r1, r0, r0 0x00000110: 00000000 andeq r0, r0, r0 0x00000114: 00000004 andeq r0, r0, r4 0x00000118: 54410002 strbpl r0, [r1], #-2 0x0000011c: 3c000000 .byte 0x00, 0x00, 0x00, 0x3c 0x00000120: 54410001 strbpl r0, [r1], #-1 0x00000124: 00000001 andeq r0, r0, r1 0x00000128: 00001000 andeq r1, r0, r0 0x0000012c: 00000000 andeq r0, r0, r0 0x00000130: 00000004 andeq r0, r0, r4 0x00000134: 54410002 strbpl r0, [r1], #-2 0x00000138: 3c000000 .byte 0x00, 0x00, 0x00, 0x3c 0x0000013c: 00000000 andeq r0, r0, r0 Here the disassembly of 0x120..0x13f is using the data that is in 0x104..0x123. This is caused by passing the wrong value to the read_memory_func(). The intention is that at this point in the loop the 'cap_buf' buffer already contains 'csize' bytes of data for the instruction at guest addr 'pc', and we want to read in an extra 'tsize' bytes. Those extra bytes are therefore at 'pc + csize', not 'pc'. On the first time through the loop 'csize' happens to be zero, so the initial read of 32 bytes into cap_buf is correct and as long as the disassembly never needs to read more data we return the correct information. Use the correct guest address in the call to read_memory_func(). Cc: qemu-stable@nongnu.org Fixes: https://bugs.launchpad.net/qemu/+bug/1900779 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20201022132445.25039-1-peter.maydell@linaro.org
| * | target/arm: fix LORID_EL1 access checkRémi Denis-Courmont2020-11-021-14/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | Secure mode is not exempted from checking SCR_EL3.TLOR, and in the future HCR_EL2.TLOR when S-EL2 is enabled. Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: fix handling of HCR.FBRémi Denis-Courmont2020-11-021-3/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | HCR should be applied when NS is set, not when it is cleared. Signed-off-by: Rémi Denis-Courmont <remi.denis.courmont@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Fix VUDOT/VSDOT (scalar) on big-endian hostsPeter Maydell2020-11-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The helper functions for performing the udot/sdot operations against a scalar were not using an address-swizzling macro when converting the index of the scalar element into a pointer into the vm array. This had no effect on little-endian hosts but meant we generated incorrect results on big-endian hosts. For these insns, the index is indexing over group of 4 8-bit values, so 32 bits per indexed entity, and H4() is therefore what we want. (For Neon the only possible input indexes are 0 and 1.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20201028191712.4910-3-peter.maydell@linaro.org
| * | target/arm: Fix float16 pairwise Neon ops on big-endian hostsPeter Maydell2020-11-021-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the neon_padd/pmax/pmin helpers for float16, a cut-and-paste error meant we were using the H4() address swizzler macro rather than the H2() which is required for 2-byte data. This had no effect on little-endian hosts but meant we put the result data into the destination Dreg in the wrong order on big-endian hosts. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20201028191712.4910-2-peter.maydell@linaro.org
| * | target/arm: Improve do_prewiden_3dRichard Henderson2020-11-022-31/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can use proper widening loads to extend 32-bit inputs, and skip the "widenfn" step. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-12-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Simplify do_long_3d and do_2scalar_longRichard Henderson2020-11-021-14/+9Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In both cases, we can sink the write-back and perform the accumulate into the normal destination temps. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-11-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Rename neon_load_reg64 to vfp_load_reg64Richard Henderson2020-11-022-46/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only uses of this function are for loading VFP double-precision values, and nothing to do with NEON. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-10-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Add read/write_neon_element64Richard Henderson2020-11-022-47/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | Replace all uses of neon_load/store_reg64 within translate-neon.c.inc. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-9-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Rename neon_load_reg32 to vfp_load_reg32Richard Henderson2020-11-022-94/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only uses of this function are for loading VFP single-precision values, and nothing to do with NEON. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-8-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Expand read/write_neon_element32 to all MemOpRichard Henderson2020-11-022-84/+37Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can then use this to improve VMOV (scalar to gp) and VMOV (gp to scalar) so that we simply perform the memory operation that we wanted, rather than inserting or extracting from a 32-bit quantity. These were the last uses of neon_load/store_reg, so remove them. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-7-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Add read/write_neon_element32Richard Henderson2020-11-022-99/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Model these off the aa64 read/write_vec_element functions. Use it within translate-neon.c.inc. The new functions do not allocate or free temps, so this rearranges the calling code a bit. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-6-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Use neon_element_offset in vfp_reg_offsetRichard Henderson2020-11-021-9/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | This seems a bit more readable than using offsetof CPU_DoubleU. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-5-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Use neon_element_offset in neon_load/store_regRichard Henderson2020-11-021-12/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | These are the only users of neon_reg_offset, so remove that. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-4-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Move neon_element_offset to translate.cRichard Henderson2020-11-022-19/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | This will shortly have users outside of translate-neon.c.inc. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-3-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | target/arm: Introduce neon_full_reg_offsetRichard Henderson2020-11-023-23/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function makes it clear that we're talking about the whole register, and not the 32-bit piece at index 0. This fixes a bug when running on a big-endian host. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20201030022618.785675-2-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* | | Merge remote-tracking branch ↵Peter Maydell2020-11-0216-24/+832
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/dgilbert/tags/pull-migration-20201102a' into staging Migration and virtiofs fixes 2020-11-02 Fixes for postcopy migration test hang A seccomp crash for virtiofsd on some !x86 Help message and minor CID fix And another crack at Max's set. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> # gpg: Signature made Mon 02 Nov 2020 19:54:59 GMT # gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full] # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert/tags/pull-migration-20201102a: tests/acceptance: Add virtiofs_submounts.py tests/acceptance/boot_linux: Accept SSH pubkey virtiofsd: Announce sub-mount points virtiofsd: Add mount ID to the lo_inode key meson.build: Check for statx() virtiofsd: Add attr_flags to fuse_entry_param virtiofsd: Check FUSE_SUBMOUNTS virtiofsd: Fix the help message of posix lock tools/virtiofsd: Check vu_init() return value (CID 1435958) virtiofsd: Seccomp: Add 'send' for syslog migration: Postpone the kick of the fault thread after recover migration: Unify reset of last_rb on destination node when recover Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | tests/acceptance: Add virtiofs_submounts.pyMax Reitz2020-11-025-0/+662
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This test invokes several shell scripts to create a random directory tree full of submounts, and then check in the VM whether every submount has its own ID and the structure looks as expected. (Note that the test scripts must be non-executable, so Avocado will not try to execute them as if they were tests on their own, too.) Because at this commit's date it is unlikely that the Linux kernel on the image provided by boot_linux.py supports submounts in virtio-fs, the test will be cancelled if no custom Linux binary is provided through the vmlinuz parameter. (The on-image kernel can be used by providing an empty string via vmlinuz=.) So, invoking the test can be done as follows: $ avocado run \ tests/acceptance/virtiofs_submounts.py \ -p vmlinuz=/path/to/linux/build/arch/x86/boot/bzImage This test requires root privileges (through passwordless sudo -n), because at this point, virtiofsd requires them. (If you have a timestamp_timeout period for sudoers (e.g. the default of 5 min), you can provide this by executing something like "sudo true" before invoking Avocado.) Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20201102161859.156603-8-mreitz@redhat.com> Tested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | tests/acceptance/boot_linux: Accept SSH pubkeyMax Reitz2020-11-021-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let download_cloudinit() take an optional pubkey, which subclasses of BootLinux can pass through setUp(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Willian Rampazzo <willianr@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201102161859.156603-7-mreitz@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | virtiofsd: Announce sub-mount pointsMax Reitz2020-11-022-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever we encounter a directory with an st_dev or mount ID that differs from that of its parent, we set the FUSE_ATTR_SUBMOUNT flag so the guest can create a submount for it. We only need to do so in lo_do_lookup(). The following functions return a fuse_attr object: - lo_create(), though fuse_reply_create(): Calls lo_do_lookup(). - lo_lookup(), though fuse_reply_entry(): Calls lo_do_lookup(). - lo_mknod_symlink(), through fuse_reply_entry(): Calls lo_do_lookup(). - lo_link(), through fuse_reply_entry(): Creating a link cannot create a submount, so there is no need to check for it. - lo_getattr(), through fuse_reply_attr(): Announcing submounts when the node is first detected (at lookup) is sufficient. We do not need to return the submount attribute later. - lo_do_readdir(), through fuse_add_direntry_plus(): Calls lo_do_lookup(). Make announcing submounts optional, so submounts are only announced to the guest with the announce_submounts option. Some users may prefer the current behavior, so that the guest learns nothing about the host mount structure. (announce_submounts is force-disabled when the guest does not present the FUSE_SUBMOUNTS capability, or when there is no statx().) Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201102161859.156603-6-mreitz@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | virtiofsd: Add mount ID to the lo_inode keyMax Reitz2020-11-022-10/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using st_dev is not sufficient to uniquely identify a mount: You can mount the same device twice, but those are still separate trees, and e.g. by mounting something else inside one of them, they may differ. Using statx(), we can get a mount ID that uniquely identifies a mount. If that is available, add it to the lo_inode key. Most of this patch is taken from Miklos's mail here: https://marc.info/?l=fuse-devel&m=160062521827983 (virtiofsd-use-mount-id.patch attachment) Suggested-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201102161859.156603-5-mreitz@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | meson.build: Check for statx()Max Reitz2020-11-021-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check whether the glibc provides statx() and if so, define CONFIG_STATX. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201102161859.156603-4-mreitz@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | virtiofsd: Add attr_flags to fuse_entry_paramMax Reitz2020-11-022-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fuse_entry_param is converted to fuse_attr on the line (by fill_entry()), so it should have a member that mirrors fuse_attr.flags. fill_entry() should then copy this fuse_entry_param.attr_flags to fuse_attr.flags. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201102161859.156603-3-mreitz@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | virtiofsd: Check FUSE_SUBMOUNTSMax Reitz2020-11-022-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FUSE_SUBMOUNTS is a pure indicator by the kernel to signal that it supports submounts. It does not check its state in the init reply, so there is nothing for fuse_lowlevel.c to do but to check its existence and copy it into fuse_conn_info.capable. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20201102161859.156603-2-mreitz@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | virtiofsd: Fix the help message of posix lockJiachen Zhang2020-11-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit 88fc107956a5812649e5918e0c092d3f78bb28ad disabled remote posix locks by default. But the --help message still says it is enabled by default. So fix it to output no_posix_lock. Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com> Message-Id: <20201027081558.29904-1-zhangjiachen.jaycee@bytedance.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | tools/virtiofsd: Check vu_init() return value (CID 1435958)Philippe Mathieu-Daudé2020-11-021-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 6f5fd837889, vu_init() can fail if malloc() returns NULL. This fixes the following Coverity warning: CID 1435958 (#1 of 1): Unchecked return value (CHECKED_RETURN) Fixes: 6f5fd837889 ("libvhost-user: support many virtqueues") Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20201102092339.2034297-1-philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | virtiofsd: Seccomp: Add 'send' for syslogDr. David Alan Gilbert2020-11-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On ppc, and some other archs, it looks like syslog ends up using 'send' rather than 'sendto'. Reference: https://github.com/kata-containers/kata-containers/issues/1050 Reported-by: amulmek1@in.ibm.com Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20201102150750.34565-1-dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | migration: Postpone the kick of the fault thread after recoverPeter Xu2020-11-021-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new migrate_send_rp_req_pages_pending() call should greatly improve destination responsiveness because it will resync faulted address after postcopy recovery. However it is also the 1st place to initiate the page request from the main thread. One thing is overlooked on that migrate_send_rp_message_req_pages() is not designed to be thread-safe. So if we wake the fault thread before syncing all the faulted pages in the main thread, it means they can race. Postpone the wake up operation after the sync of faulted addresses. Fixes: 0c26781c09 ("migration: Sync requested pages after postcopy recovery") Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201102153010.11979-3-peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * | | migration: Unify reset of last_rb on destination node when recoverPeter Xu2020-11-022-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When postcopy recover happens, we need to reset last_rb after each return of postcopy_pause_fault_thread() because that means we just got the postcopy migration continued. Unify this reset to the place right before we want to kick the fault thread again, when we get the command MIG_CMD_POSTCOPY_RESUME from source. This is actually more than that - because the main thread on destination will now be able to call migrate_send_rp_req_pages_pending() too, so the fault thread is not the only user of last_rb now. Move the reset earlier will allow the first call to migrate_send_rp_req_pages_pending() to use the reset value even if called from the main thread. (NOTE: this is not a real fix to 0c26781c09 mentioned below, however it is just a mark that when picking up 0c26781c09 we'd better have this one too; the real fix will come later) Fixes: 0c26781c09 ("migration: Sync requested pages after postcopy recovery") Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20201102153010.11979-2-peterx@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* | | | Merge remote-tracking branch 'remotes/nvme/tags/pull-nvme-20201102' into stagingPeter Maydell2020-11-0212-300/+1022
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nvme pull 2 Nov 2020 # gpg: Signature made Mon 02 Nov 2020 15:20:30 GMT # gpg: using RSA key DBC11D2D373B4A3755F502EC625156610A4F6CC0 # gpg: Good signature from "Keith Busch <kbusch@kernel.org>" [unknown] # gpg: aka "Keith Busch <keith.busch@gmail.com>" [unknown] # gpg: aka "Keith Busch <keith.busch@intel.com>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: DBC1 1D2D 373B 4A37 55F5 02EC 6251 5661 0A4F 6CC0 * remotes/nvme/tags/pull-nvme-20201102: (30 commits) hw/block/nvme: fix queue identifer validation hw/block/nvme: fix create IO SQ/CQ status codes hw/block/nvme: fix prp mapping status codes hw/block/nvme: report actual LBA data shift in LBAF hw/block/nvme: add trace event for requests with non-zero status code hw/block/nvme: add nsid to get/setfeat trace events hw/block/nvme: reject io commands if only admin command set selected hw/block/nvme: support for admin-only command set hw/block/nvme: validate command set selected hw/block/nvme: support per-namespace smart log hw/block/nvme: fix log page offset check hw/block/nvme: remove pointless rw indirection hw/block/nvme: update nsid when registered hw/block/nvme: change controller pci id pci: allocate pci id for nvme hw/block/nvme: support multiple namespaces hw/block/nvme: refactor identify active namespace id list hw/block/nvme: add support for sgl bit bucket descriptor hw/block/nvme: add support for scatter gather lists hw/block/nvme: harden cmb access ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | | hw/block/nvme: fix queue identifer validationGollu Appalanaidu2020-10-271-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nvme_check_{sq,cq} functions check if the given queue identifer is valid *and* that the queue exists. Thus, the function return value cannot simply be inverted to check if the identifer is valid and that the queue does *not* exist. Replace the call with an OR'ed version of the checks. Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | | hw/block/nvme: fix create IO SQ/CQ status codesGollu Appalanaidu2020-10-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace the Invalid Field in Command with the Invalid PRP Offset status code in the nvme_create_{cq,sq} functions. Also, allow PRP1 to be address 0x0. Also replace the Completion Queue Invalid status code returned in nvme_create_cq when the the queue identifier is invalid with the Invalid Queue Identifier. The Completion Queue Invalid status code is exclusively for indicating that the completion queue identifer given when creating a submission queue is invalid. See NVM Express v1.3d, Section 5.3 ("Create I/O Completion Queue command") and 5.4("Create I/O Submission Queue command"). Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | | hw/block/nvme: fix prp mapping status codesGollu Appalanaidu2020-10-273-18/+7Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Address 0 is not an invalid address. Remove those invalikd checks. Unaligned PRP2 and PRP list entries should result in Invalid PRP Offset status code and not Invalid Field. Fix that. See NVMe Express v1.3d, Section 4.3 ("Physical Region Page Entry and List"). Suggested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Gollu Appalanaidu <anaidu.gollu@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | | hw/block/nvme: report actual LBA data shift in LBAFDmitry Fomichev2020-10-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculate the data shift value to report based on the set value of logical_block_size device property. In the process, use a local variable to calculate the LBA format index instead of the hardcoded value 0. This makes the code more readable and it will make it easier to add support for multiple LBA formats in the future. Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | | hw/block/nvme: add trace event for requests with non-zero status codeKlaus Jensen2020-10-272-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a command results in a non-zero status code, trace it. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| * | | hw/block/nvme: add nsid to get/setfeat trace eventsKlaus Jensen2020-10-272-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Include the namespace id in the pci_nvme_{get,set}feat trace events. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| * | | hw/block/nvme: reject io commands if only admin command set selectedKlaus Jensen2020-10-272-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the host sets CC.CSS to 111b, all commands submitted to I/O queues should be completed with status Invalid Command Opcode. Note that this is technically a v1.4 feature, but it does not hurt to implement before we finally bump the reported version implemented. Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
| * | | hw/block/nvme: support for admin-only command setKeith Busch2020-10-272-1/+3
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | | hw/block/nvme: validate command set selectedKeith Busch2020-10-273-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fail to start the controller if the user requests a command set that the controller does not support. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | | hw/block/nvme: support per-namespace smart logKeith Busch2020-10-272-24/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let the user specify a specific namespace if they want to get access stats for a specific namespace. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | | hw/block/nvme: fix log page offset checkKeith Busch2020-10-271-12/+10Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Return error if the requested offset starts after the size of the log being returned. Also, move the check for earlier in the function so we're not doing unnecessary calculations. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed- by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | | hw/block/nvme: remove pointless rw indirectionKeith Busch2020-10-272-63/+29Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code switches on the opcode to invoke a function specific to that opcode. There's no point in consolidating back to a common function that just switches on that same opcode without any actual common code. Restore the opcode specific behavior without going back through another level of switches. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
| * | | hw/block/nvme: update nsid when registeredKlaus Jensen2020-10-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the user does not specify an nsid parameter on the nvme-ns device, nvme_register_namespace will find the first free namespace id and assign that. This fix makes sure the assigned id is saved. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
| * | | hw/block/nvme: change controller pci idKlaus Jensen2020-10-273-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two reasons for changing this: 1. The nvme device currently uses an internal Intel device id. 2. Since commits "nvme: fix write zeroes offset and count" and "nvme: support multiple namespaces" the controller device no longer has the quirks that the Linux kernel think it has. As the quirks are applied based on pci vendor and device id, change them to get rid of the quirks. To keep backward compatibility, add a new 'use-intel-id' parameter to the nvme device to force use of the Intel vendor and device id. This is off by default but add a compat property to set this for 5.1 machines and older. If a 5.1 machine is booted (or the use-intel-id parameter is explicitly set to true), the Linux kernel will just apply these unnecessary quirks: 1. NVME_QUIRK_IDENTIFY_CNS which says that the device does not support anything else than values 0x0 and 0x1 for CNS (Identify Namespace and Identify Namespace). With multiple namespace support, this just means that the kernel will "scan" namespaces instead of using "Active Namespace ID list" (CNS 0x2). 2. NVME_QUIRK_DISABLE_WRITE_ZEROES. The nvme device started out with a broken Write Zeroes implementation which has since been fixed in commit 9d6459d21a6e ("nvme: fix write zeroes offset and count"). Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
| * | | pci: allocate pci id for nvmeKlaus Jensen2020-10-274-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The emulated nvme device (hw/block/nvme.c) is currently using an internal Intel device id. Prepare to change that by allocating a device id under the 1b36 (Red Hat, Inc.) vendor id. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Acked-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>
| * | | hw/block/nvme: support multiple namespacesKlaus Jensen2020-10-276-114/+426
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for multiple namespaces by introducing a new 'nvme-ns' device model. The nvme device creates a bus named from the device name ('id'). The nvme-ns devices then connect to this and registers themselves with the nvme device. This changes how an nvme device is created. Example with two namespaces: -drive file=nvme0n1.img,if=none,id=disk1 -drive file=nvme0n2.img,if=none,id=disk2 -device nvme,serial=deadbeef,id=nvme0 -device nvme-ns,drive=disk1,bus=nvme0,nsid=1 -device nvme-ns,drive=disk2,bus=nvme0,nsid=2 The drive property is kept on the nvme device to keep the change backward compatible, but the property is now optional. Specifying a drive for the nvme device will always create the namespace with nsid 1. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Minwoo Im <minwoo.im.dev@gmail.com>
| * | | hw/block/nvme: refactor identify active namespace id listKlaus Jensen2020-10-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prepare to support inactive namespaces. Signed-off-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org>