summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* accel/tcg: allow plugin instrumentation to be disable via cflagsAlex Bennée2021-02-187-27/+49
| | | | | | | | | | | | | | | | | | | When icount is enabled and we recompile an MMIO access we end up double counting the instruction execution. To avoid this we introduce the CF_MEMI cflag which only allows memory instrumentation for the next TB (which won't yet have been counted). As this is part of the hashed compile flags we will only execute the generated TB while coming out of a cpu_io_recompile. While we are at it delete the old TODO. We might as well keep the translation handy as it's likely you will repeatedly hit it on each MMIO access. Reported-by: Aaron Lindsay <aaron@os.amperecomputing.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Aaron Lindsay <aaron@os.amperecomputing.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-21-alex.bennee@linaro.org>
* accel/tcg: remove CF_NOCACHE and special casesAlex Bennée2021-02-182-39/+15Star
| | | | | | | | | | Now we no longer generate CF_NOCACHE blocks we can remove a bunch of the special case handling for them. While we are at it we can remove the unused tb->orig_tb field and save a few bytes on the TB structure. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-20-alex.bennee@linaro.org>
* accel/tcg: re-factor non-RAM execution codeAlex Bennée2021-02-181-15/+15
| | | | | | | | | | There is no real need to use CF_NOCACHE here. As long as the TB isn't linked to other TBs or included in the QHT or jump cache then it will only get executed once. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-19-alex.bennee@linaro.org>
* accel/tcg: cache single instruction TB on pending replay exceptionAlex Bennée2021-02-181-40/+4Star
| | | | | | | | | | | | | | Again there is no reason to jump through the nocache hoops to execute a single instruction block. We do have to add an additional wrinkle to the cpu_handle_interrupt case to ensure we let through a TB where we have specifically disabled icount for the block. As the last user of cpu_exec_nocache we can now remove the function. Further clean-up will follow in subsequent patches. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-18-alex.bennee@linaro.org>
* accel/tcg: actually cache our partial icount TBAlex Bennée2021-02-181-8/+9
| | | | | | | | | | | | | When we exit a block under icount with instructions left to execute we might need a shorter than normal block to take us to the next deterministic event. Instead of creating a throwaway block on demand we use the existing compile flags mechanism to ensure we fetch (or compile and fetch) a block with exactly the number of instructions we need. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-17-alex.bennee@linaro.org>
* tests/acceptance: add a new set of tests to exercise pluginsAlex Bennée2021-02-182-0/+92
| | | | | | | | | | | This is just a simple test to count the instructions executed by a kernel. However a later test will detect a failure condition when icount is enabled. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210213130325.14781-16-alex.bennee@linaro.org>
* tests/plugin: expand insn test to detect duplicate instructionsAlex Bennée2021-02-184-1/+38
| | | | | | | | | | | | | | | | | | A duplicate insn is one that is appears to be executed twice in a row. This is currently possible due to -icount and cpu_io_recompile() causing a re-translation of a block. On it's own this won't trigger any tests though. The heuristics that the plugin use can't deal with the x86 rep instruction which (validly) will look like executing the same instruction several times. To avoid problems later we tweak the rules for x86 to run the "inline" version of the plugin. This also has the advantage of increasing coverage of the plugin code (see bugfix in previous commit). Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-15-alex.bennee@linaro.org>
* target/sh4: Create superh_io_recompile_replay_branchRichard Henderson2021-02-182-12/+18
| | | | | | | | | | | Move the code from accel/tcg/translate-all.c to target/sh4/cpu.c. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210208233906.479571-5-richard.henderson@linaro.org> Message-Id: <20210213130325.14781-14-alex.bennee@linaro.org>
* target/mips: Create mips_io_recompile_replay_branchRichard Henderson2021-02-182-10/+20
| | | | | | | | | | | Move the code from accel/tcg/translate-all.c to target/mips/cpu.c. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210208233906.479571-4-richard.henderson@linaro.org> Message-Id: <20210213130325.14781-13-alex.bennee@linaro.org>
* accel/tcg: Create io_recompile_replay_branch hookRichard Henderson2021-02-182-4/+23
| | | | | | | | | | | | | | Create a hook in which to split out the mips and sh4 ifdefs from cpu_io_recompile. [AJB: s/stoped/stopped/] Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210208233906.479571-3-richard.henderson@linaro.org> Message-Id: <20210213130325.14781-12-alex.bennee@linaro.org>
* exec: Move TranslationBlock typedef to qemu/typedefs.hRichard Henderson2021-02-189-12/+8Star
| | | | | | | | | | | | This also means we don't need an extra declaration of the structure in hw/core/cpu.h. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210208233906.479571-2-richard.henderson@linaro.org> Message-Id: <20210213130325.14781-11-alex.bennee@linaro.org>
* accel/tcg/plugin-gen: fix the call signature for inline callbacksAlex Bennée2021-02-181-21/+11Star
| | | | | | | | | | | | | | | A recent change to the handling of constants in TCG changed the pattern of ops emitted for a constant add. We no longer emit a mov and the constant can be applied directly to the TCG_op_add arguments. This was causing SEGVs when running the insn plugin with arg=inline. Fix this by updating copy_add_i64 to do the right thing while also adding a comment at the top of the append section as an aide memoir if something like this happens again. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: Emilio G. Cota <cota@braap.org> Message-Id: <20210213130325.14781-10-alex.bennee@linaro.org>
* contrib: Open brace '{' following struct go on the same linezhouyang2021-02-181-2/+1Star
| | | | | | | | | | | I found some style problems whil check the code using checkpatch.pl. This commit fixs the issue below: ERROR: that open brace { should be on the previous line Signed-off-by: zhouyang <zhouyang789@huawei.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210118031004.1662363-6-zhouyang789@huawei.com> Message-Id: <20210213130325.14781-9-alex.bennee@linaro.org>
* contrib: space required after that ','zhouyang2021-02-181-6/+6
| | | | | | | | | | | I am reading contrib related code and found some style problems while check the code using checkpatch.pl. This commit fixs the issue below: ERROR: space required after that ',' Signed-off-by: zhouyang <zhouyang789@huawei.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210118031004.1662363-5-zhouyang789@huawei.com> Message-Id: <20210213130325.14781-8-alex.bennee@linaro.org>
* contrib: Add spaces around operatorzhouyang2021-02-181-1/+1
| | | | | | | | | | | I am reading contrib related code and found some style problems while check the code using checkpatch.pl. This commit fixs the issue below: ERROR: spaces required around that '*' Signed-off-by: zhouyang <zhouyang789@huawei.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210118031004.1662363-4-zhouyang789@huawei.com> Message-Id: <20210213130325.14781-7-alex.bennee@linaro.org>
* contrib: Fix some code style problems, ERROR: "foo * bar" should be "foo *bar"zhouyang2021-02-181-1/+1
| | | | | | | | | | | I am reading contrib related code and found some style problems while check the code using checkpatch.pl. This commit fixs the issue below: ERROR: "foo * bar" should be "foo *bar" Signed-off-by: zhouyang <zhouyang789@huawei.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210118031004.1662363-3-zhouyang789@huawei.com> Message-Id: <20210213130325.14781-6-alex.bennee@linaro.org>
* contrib: Don't use '#' flag of printf formatzhouyang2021-02-184-6/+6
| | | | | | | | | | | I am reading contrib related code and found some style problems while check the code using checkpatch.pl. This commit fixs the misuse of '#' flag of printf format Signed-off-by: zhouyang <zhouyang789@huawei.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210118031004.1662363-2-zhouyang789@huawei.com> Message-Id: <20210213130325.14781-5-alex.bennee@linaro.org>
* plugins: new hwprofile pluginAlex Bennée2021-02-183-0/+340
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a plugin intended to help with profiling access to various bits of system hardware. It only really makes sense for system emulation. It takes advantage of the recently exposed helper API that allows us to see the device name (memory region name) associated with a device. You can specify arg=read or arg=write to limit the tracking to just reads or writes (by default it does both). The pattern option: -plugin ./tests/plugin/libhwprofile.so,arg=pattern will allow you to see the access pattern to devices, eg: gic_cpu @ 0xffffffc010040000 off:00000000, 8, 1, 8, 1 off:00000000, 4, 1, 4, 1 off:00000000, 2, 1, 2, 1 off:00000000, 1, 1, 1, 1 The source option: -plugin ./tests/plugin/libhwprofile.so,arg=source will track the virtual source address of the instruction making the access: pl011 @ 0xffffffc010031000 pc:ffffffc0104c785c, 1, 4, 0, 0 pc:ffffffc0104c7898, 1, 4, 0, 0 pc:ffffffc010512bcc, 2, 1867, 0, 0 You cannot mix source and pattern. Finally the match option allow you to limit the tracking to just the devices you care about. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Robert Foley <robert.foley@linaro.org> Message-Id: <20210213130325.14781-4-alex.bennee@linaro.org>
* plugins: add API to return a name for a IO deviceAlex Bennée2021-02-182-0/+26
| | | | | | | | | | This may well end up being anonymous but it should always be unique. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Clement Deschamps <clement.deschamps@greensocs.com> Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210213130325.14781-3-alex.bennee@linaro.org>
* hw/virtio/pci: include vdev name in registered PCI sectionsAlex Bennée2021-02-181-8/+14
| | | | | | | | | | | When viewing/debugging memory regions it is sometimes hard to figure out which PCI device something belongs to. Make the names unique by including the vdev name in the name string. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20210213130325.14781-2-alex.bennee@linaro.org>
* Merge remote-tracking branch ↵Peter Maydell2021-02-177-38/+158
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20210216' into staging virtiofsd pull 2021-02-16 Vivek's support for new FUSE KILLPRIV_V2 and some smaller cleanups. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> # gpg: Signature made Tue 16 Feb 2021 18:34:32 GMT # gpg: using RSA key 45F5C71B4A0CB7FB977A9FA90516331EBC5BFDE7 # gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full] # Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7 * remotes/dgilbert-gitlab/tags/pull-virtiofs-20210216: virtiofsd: Do not use a thread pool by default viriofsd: Add support for FUSE_HANDLE_KILLPRIV_V2 virtiofsd: Save error code early at the failure callsite tools/virtiofsd: Replace the word 'whitelist' virtiofsd: vu_dispatch locking should never fail virtiofsd: Allow to build it without the tools Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * virtiofsd: Do not use a thread pool by defaultVivek Goyal2021-02-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we created a thread pool (With 64 max threads per pool) for each virtqueue. We hoped that this will provide us with better scalability and performance. But in practice, we are getting better numbers in most of the cases when we don't create a thread pool at all and a single thread per virtqueue receives the request and processes it. Hence, I am proposing that we switch to no thread pool by default (equivalent of --thread-pool-size=0). This will provide out of box better performance to most of the users. In fact other users have confirmed that not using a thread pool gives them better numbers. So why not use this as default. It can be changed when somebody can fix the issues with thread pool performance. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Message-Id: <20210210182744.27324-2-vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * viriofsd: Add support for FUSE_HANDLE_KILLPRIV_V2Vivek Goyal2021-02-164-8/+103
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds basic support for FUSE_HANDLE_KILLPRIV_V2. virtiofsd can enable/disable this by specifying option "-o killpriv_v2/no_killpriv_v2". By default this is enabled as long as client supports it Enabling this option helps with performance in write path. Without this option, currently every write is first preceeded with a getxattr() operation to find out if security.capability is set. (Write is supposed to clear security.capability). With this option enabled, server is signing up for clearing security.capability on every WRITE and also clearing suid/sgid subject to certain rules. This gets rid of extra getxattr() call for every WRITE and improves performance. This is true when virtiofsd is run with option -o xattr. What does enabling FUSE_HANDLE_KILLPRIV_V2 mean for file server implementation. It needs to adhere to following rules. Thanks to Miklos for this summary. - clear "security.capability" on write, truncate and chown unconditionally - clear suid/sgid in case of following. Note, sgid is cleared only if group executable bit is set. o setattr has FATTR_SIZE and FATTR_KILL_SUIDGID set. o setattr has FATTR_UID or FATTR_GID o open has O_TRUNC and FUSE_OPEN_KILL_SUIDGID o create has O_TRUNC and FUSE_OPEN_KILL_SUIDGID flag set. o write has FUSE_WRITE_KILL_SUIDGID >From Linux VFS client perspective, here are the requirements. - caps are always cleared on chown/write/truncate - suid is always cleared on chown, while for truncate/write it is cleared only if caller does not have CAP_FSETID. - sgid is always cleared on chown, while for truncate/write it is cleared only if caller does not have CAP_FSETID as well as file has group execute permission. virtiofsd implementation has not changed much to adhere to above ruls. And reason being that current assumption is that we are running on Linux and on top of filesystems like ext4/xfs which already follow above rules. On write, truncate, chown, seucurity.capability is cleared. And virtiofsd drops CAP_FSETID if need be and that will lead to clearing of suid/sgid. But if virtiofsd is running on top a filesystem which breaks above assumptions, then it will have to take extra actions to emulate above. That's a TODO for later when need arises. Note: create normally is supposed to be called only when file does not exist. So generally there should not be any question of clearing setuid/setgid. But it is possible that after client checks that file is not present, some other client creates file on server and this race can trigger sending FUSE_CREATE. In that case, if O_TRUNC is set, we should clear suid/sgid if FUSE_OPEN_KILL_SUIDGID is also set. v3: - Resolved conflicts due to lo_inode_open() changes. - Moved capability code in lo_do_open() so that both lo_open() and lo_create() can benefit from common code. - Dropped changes to kernel headers as these are part of qemu already. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210208224024.43555-3-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * virtiofsd: Save error code early at the failure callsiteVivek Goyal2021-02-161-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change error code handling slightly in lo_setattr(). Right now we seem to jump to out_err and assume that "errno" is valid and use that to send reply. But if caller has to do some other operations before jumping to out_err, then it does the dance of first saving errno to saverr and the restore errno before jumping to out_err. This makes it more confusing. I am about to make more changes where caller will have to do some work after error before jumping to out_err. I found it easier to change the convention a bit. That is caller saves error in "saverr" before jumping to out_err. And out_err uses "saverr" to send error back and does not rely on "errno" having actual error. v3: Resolved conflicts in lo_setattr() due to lo_inode_open() changes. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210208224024.43555-2-vgoyal@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * tools/virtiofsd: Replace the word 'whitelist'Philippe Mathieu-Daudé2021-02-162-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Follow the inclusive terminology from the "Conscious Language in your Open Source Projects" guidelines [*] and replace the words "whitelist" appropriately. [*] https://github.com/conscious-lang/conscious-lang-docs/blob/main/faq.md Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210205171817.2108907-3-philmd@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * virtiofsd: vu_dispatch locking should never failGreg Kurz2021-02-161-14/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pthread_rwlock_rdlock() and pthread_rwlock_wrlock() can fail if a deadlock condition is detected or the current thread already owns the lock. They can also fail, like pthread_rwlock_unlock(), if the mutex wasn't properly initialized. None of these are ever expected to happen with fv_VuDev::vu_dispatch_rwlock. Some users already check the return value and assert, some others don't. Introduce rdlock/wrlock/unlock wrappers that just do the former and use them everywhere for improved consistency and robustness. This is just cleanup. It doesn't fix any actual issue. Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210203182434.93870-1-groug@kaod.org> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
| * virtiofsd: Allow to build it without the toolsWainer dos Santos Moschetta2021-02-161-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | This changed the Meson build script to allow virtiofsd be built even though the tools build is disabled, thus honoring the --enable-virtiofsd option. Fixes: cece116c939d219070b250338439c2d16f94e3da (configure: add option for virtiofsd) Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com> Message-Id: <20210201211456.1133364-2-wainersm@redhat.com> Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
* | Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into ↵Peter Maydell2021-02-1736-81/+735
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | staging * HVF fixes * Extra qos-test debugging output (Christian) * SEV secret address autodetection (James) * SEV-ES support (Thomas) * Relocatable paths bugfix (Stefan) * RR fix (Pavel) * EventNotifier fix (Greg) # gpg: Signature made Tue 16 Feb 2021 16:15:59 GMT # gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83 # gpg: issuer "pbonzini@redhat.com" # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini-gitlab/tags/for-upstream: (21 commits) replay: fix icount request when replaying clock access event_notifier: Set ->initialized earlier in event_notifier_init() hvf: Fetch cr4 before evaluating CPUID(1) target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNT hvf: x86: Remove unused definitions target/i386/hvf: add vmware-cpuid-freq cpu feature hvf: Guard xgetbv call util/cutils: Skip "." when looking for next directory component tests/qtest/qos-test: dump QEMU command if verbose tests/qtest/qos-test: dump environment variables if verbose tests/qtest/qos-test: dump qos graph if verbose libqos/qgraph_internal: add qos_printf() and qos_printf_literal() libqos/qgraph: add qos_node_create_driver_named() sev/i386: Enable an SEV-ES guest based on SEV policy kvm/i386: Use a per-VM check for SMM capability sev/i386: Don't allow a system reset under an SEV-ES guest sev/i386: Allow AP booting under SEV-ES sev/i386: Require in-kernel irqchip support for SEV-ES guests sev/i386: Add initial support for SEV-ES sev: update sev-inject-launch-secret to make gpa optional ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
| * | replay: fix icount request when replaying clock accessPavel Dovgalyuk2021-02-165-35/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Record/replay provides REPLAY_CLOCK_LOCKED macro to access the clock when vm_clock_seqlock is locked. This macro is needed because replay internals operate icount. In locked case replay use icount_get_raw_locked for icount request, which prevents excess locking which leads to deadlock. But previously only record code used *_locked function and replay did not. Therefore sometimes clock access lead to deadlocks. This patch fixes clock access for replay too and uses *_locked icount access function. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Message-Id: <161347990483.1313189.8371838968343494161.stgit@pasha-ThinkPad-X280> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | event_notifier: Set ->initialized earlier in event_notifier_init()Greg Kurz2021-02-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise the call to event_notifier_set() is a nop, which causes the SLOF firmware on POWER to hang when booting from a virtio-scsi device: virtio_scsi_dataplane_start() virtio_scsi_vring_init() virtio_bus_set_host_notifier() <- assign == true event_notifier_init() <- active == 1 event_notifier_set() <- fails right away if !e->initialized Fixes: e34e47eb28c0 ("event_notifier: handle initialization failure better") Cc: mlevitsk@redhat.com Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <20210216120247.1293569-1-groug@kaod.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | hvf: Fetch cr4 before evaluating CPUID(1)Alexander Graf2021-02-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CPUID function 1 has a bit called OSXSAVE which tells user space the status of the CR4.OSXSAVE bit. Our generic CPUID function injects that bit based on the status of CR4. With Hypervisor.framework, we do not synchronize full CPU state often enough for this function to see the CR4 update before guest user space asks for it. To be on the save side, let's just always synchronize it when we receive a CPUID(1) request. That way we can set the bit with real confidence. Reported-by: Asad Ali <asad@osaro.com> Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20210123004129.6364-1-agraf@csgraf.de> [RB: resolved conflict with another CPUID change] Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | target/i386/hvf: add rdmsr 35H MSR_CORE_THREAD_COUNTVladislav Yaroshchuk2021-02-162-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some guests (ex. Darwin-XNU) can attemp to read this MSR to retrieve and validate CPU topology comparing it to ACPI MADT content MSR description from Intel Manual: 35H: MSR_CORE_THREAD_COUNT: Configured State of Enabled Processor Core Count and Logical Processor Count Bits 15:0 THREAD_COUNT The number of logical processors that are currently enabled in the physical package Bits 31:16 Core_COUNT The number of processor cores that are currently enabled in the physical package Bits 63:32 Reserved Signed-off-by: Vladislav Yaroshchuk <yaroshchuk2000@gmail.com> Message-Id: <20210113205323.33310-1-yaroshchuk2000@gmail.com> [RB: reordered MSR definition and dropped u suffix from shift offset] Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | hvf: x86: Remove unused definitionsAlexander Graf2021-02-161-16/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hvf i386 has a few struct and cpp definitions that are never used. Remove them. Suggested-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Alexander Graf <agraf@csgraf.de> Message-Id: <20210120224444.71840-3-agraf@csgraf.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | target/i386/hvf: add vmware-cpuid-freq cpu featureVladislav Yaroshchuk2021-02-161-1/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For `-accel hvf` cpu_x86_cpuid() is wrapped with hvf_cpu_x86_cpuid() to add paravirtualization cpuid leaf 0x40000010 https://lkml.org/lkml/2008/10/1/246 Leaf 0x40000010, Timing Information: EAX: (Virtual) TSC frequency in kHz. EBX: (Virtual) Bus (local apic timer) frequency in kHz. ECX, EDX: RESERVED (Per above, reserved fields are set to zero). On macOS TSC and APIC Bus frequencies can be readed by sysctl call with names `machdep.tsc.frequency` and `hw.busfrequency` This options is required for Darwin-XNU guest to be synchronized with host Leaf 0x40000000 not exposes HVF leaving hypervisor signature empty Signed-off-by: Vladislav Yaroshchuk <yaroshchuk2000@gmail.com> Message-Id: <20210122150518.3551-1-yaroshchuk2000@gmail.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | hvf: Guard xgetbv callHill Ma2021-02-161-12/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This prevents illegal instruction on cpus that do not support xgetbv. Buglink: https://bugs.launchpad.net/qemu/+bug/1758819 Reviewed-by: Cameron Esfahani <dirty@apple.com> Signed-off-by: Hill Ma <maahiuzeon@gmail.com> Message-Id: <X/6OJ7qk0W6bHkHQ@Hills-Mac-Pro.local> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | util/cutils: Skip "." when looking for next directory componentStefan Weil2021-02-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When looking for the next directory component, a "." component is now skipped. This fixes the path(s) used for firmware lookup for the prefix == bindir case which is standard for QEMU on Windows and where the internally used bindir value ends with "/.". Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20210208205752.2488774-1-sw@weilnetz.de> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | tests/qtest/qos-test: dump QEMU command if verboseChristian Schoenebeck2021-02-161-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If qtests are run in verbose mode (i.e. if --verbose CL argument was provided) then print the assembled qemu command line for each test. Use qos_printf() instead of g_test_message() to avoid the latter cluttering the output. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <110bef3595cb841dfa1b86733c174ac9774eb37e.1611704181.git.qemu_oss@crudebyte.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | tests/qtest/qos-test: dump environment variables if verboseChristian Schoenebeck2021-02-161-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If qtests are run in verbose mode (i.e. if --verbose CL argument was provided) then print all environment variables to stdout before running the individual tests. It is common nowadays, at least being able to output all config vectors in a build chain, especially if it is required to investigate build- and test-issues on foreign/remote machines, which includes environment variables. In the context of writing new test cases this is also useful for finding out whether there are already some existing options for common questions like is there a preferred location for writing test files to? Is there a maximum size for test data? Is there a deadline for running tests? Use qos_printf() instead of g_test_message() to avoid the latter cluttering the output. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <21d77b33c578d80b5bba1068e61fd3562958b3c2.1611704181.git.qemu_oss@crudebyte.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | tests/qtest/qos-test: dump qos graph if verboseChristian Schoenebeck2021-02-163-0/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If qtests were run in verbose mode (i.e. if --verbose CL argument was provided) then dump the generated qos graph (all nodes and edges, along with their current individual availability status) to stdout, which allows to identify problems in the created qos graph e.g. when writing new qos tests. See API doc comment on function qos_dump_graph() for details. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <6bffb6e38589fb2c06a2c1b5deed33f3e710fed1.1611704181.git.qemu_oss@crudebyte.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | libqos/qgraph_internal: add qos_printf() and qos_printf_literal()Christian Schoenebeck2021-02-161-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These two are macros wrapping regular printf() call. They are intended to be used instead of calling printf() directly in order to avoid breaking TAP output format. TAP output format is enabled by using --tap command line argument. Starting with glib 2.62 it is enabled by default. Unfortunately there is currently no public glib API available to check whether TAP output format is enabled. For that reason qos_printf() simply always prepends a '#' character for now. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <653a5ef61c5e7d160e4d6294e542c57ea324cee4.1611704181.git.qemu_oss@crudebyte.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | libqos/qgraph: add qos_node_create_driver_named()Christian Schoenebeck2021-02-163-3/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far the qos subsystem of the qtest framework had the limitation that only one instance of the same official QEMU (QMP) driver name could be created for qtests. That's because a) the created qos node names must always be unique, b) the node name must match the official QEMU driver name being instantiated and c) all nodes are in a global space shared by all tests. This patch removes this limitation by introducing a new function qos_node_create_driver_named() which allows test case authors to specify a node name being different from the actual associated QEMU driver name. It fills the new 'qemu_name' field of QOSGraphNode for that purpose. Adjust build_driver_cmd_line() and qos_graph_node_set_availability() to correctly deal with either accessing node name vs. node's qemu_name correctly. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <3be962ff38f3396f8040deaa5ffdab525c4e0b16.1611704181.git.qemu_oss@crudebyte.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | sev/i386: Enable an SEV-ES guest based on SEV policyTom Lendacky2021-02-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the sev_es_enabled() function return value to be based on the SEV policy that has been specified. SEV-ES is enabled if SEV is enabled and the SEV-ES policy bit is set in the policy object. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Message-Id: <c69f81c6029f31fc4c52a9f35f1bd704362476a5.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | kvm/i386: Use a per-VM check for SMM capabilityTom Lendacky2021-02-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SMM is not currently supported for an SEV-ES guest by KVM. Change the SMM capability check from a KVM-wide check to a per-VM check in order to have a finer-grained SMM capability check. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Message-Id: <f851903809e9d4e6a22d5dfd738dac8da991e28d.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | sev/i386: Don't allow a system reset under an SEV-ES guestTom Lendacky2021-02-1611-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An SEV-ES guest does not allow register state to be altered once it has been measured. When an SEV-ES guest issues a reboot command, Qemu will reset the vCPU state and resume the guest. This will cause failures under SEV-ES. Prevent that from occuring by introducing an arch-specific callback that returns a boolean indicating whether vCPUs are resettable. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: Aleksandar Rikalo <aleksandar.rikalo@syrmia.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: David Hildenbrand <david@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Message-Id: <1ac39c441b9a3e970e9556e1cc29d0a0814de6fd.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | sev/i386: Allow AP booting under SEV-ESPaolo Bonzini2021-02-166-1/+151
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When SEV-ES is enabled, it is not possible modify the guests register state after it has been initially created, encrypted and measured. Normally, an INIT-SIPI-SIPI request is used to boot the AP. However, the hypervisor cannot emulate this because it cannot update the AP register state. For the very first boot by an AP, the reset vector CS segment value and the EIP value must be programmed before the register has been encrypted and measured. Search the guest firmware for the guest for a specific GUID that tells Qemu the value of the reset vector to use. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <22db2bfb4d6551aed661a9ae95b4fdbef613ca21.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | sev/i386: Require in-kernel irqchip support for SEV-ES guestsTom Lendacky2021-02-161-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In prep for AP booting, require the use of in-kernel irqchip support. This lessens the Qemu support burden required to boot APs. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Message-Id: <e9aec5941e613456f0757f5a73869cdc5deea105.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | sev/i386: Add initial support for SEV-ESTom Lendacky2021-02-164-2/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide initial support for SEV-ES. This includes creating a function to indicate the guest is an SEV-ES guest (which will return false until all support is in place), performing the proper SEV initialization and ensuring that the guest CPU state is measured as part of the launch. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Co-developed-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com> Message-Id: <2e6386cbc1ddeaf701547dd5677adf5ddab2b6bd.1611682609.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | sev: update sev-inject-launch-secret to make gpa optionalJames Bottomley2021-02-162-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the gpa isn't specified, it's value is extracted from the OVMF properties table located below the reset vector (and if this doesn't exist, an error is returned). OVMF has defined the GUID for the SEV secret area as 4c2eb361-7d9b-4cc3-8081-127c90d3d294 and the format of the <data> is: <base>|<size> where both are uint32_t. We extract <base> and use it as the gpa for the injection. Note: it is expected that the injected secret will also be GUID described but since qemu can't interpret it, the format is left undefined here. Signed-off-by: James Bottomley <jejb@linux.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210204193939.16617-3-jejb@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | pc: add parser for OVMF reset blockJames Bottomley2021-02-164-5/+123
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OVMF is developing a mechanism for depositing a GUIDed table just below the known location of the reset vector. The table goes backwards in memory so all entries are of the form <data>|len|<GUID> Where <data> is arbtrary size and type, <len> is a uint16_t and describes the entire length of the entry from the beginning of the data to the end of the guid. The foot of the table is of this form and <len> for this case describes the entire size of the table. The table foot GUID is defined by OVMF as 96b582de-1fb2-45f7-baea-a366c55a082d and if the table is present this GUID is just below the reset vector, 48 bytes before the end of the firmware file. Add a parser for the ovmf reset block which takes a copy of the block, if the table foot guid is found, minus the footer and a function for later traversal to return the data area of any specified GUIDs. Signed-off-by: James Bottomley <jejb@linux.ibm.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210204193939.16617-2-jejb@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | Merge remote-tracking branch ↵Peter Maydell2021-02-1756-552/+2975
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'remotes/pmaydell/tags/pull-target-arm-20210217' into staging target-arm queue: * Support ARMv8.5-MemTag for linux-user * ncpm7xx: Support SMBus * MAINTAINERS: add section for Clock framework # gpg: Signature made Wed 17 Feb 2021 11:01:45 GMT # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20210217: (37 commits) MAINTAINERS: add myself maintainer for the clock framework hw/i2c: Implement NPCM7XX SMBus Module FIFO Mode hw/i2c: Add a QTest for NPCM7XX SMBus Device hw/arm: Add I2C sensors and EEPROM for GSJ machine hw/arm: Add I2C sensors for NPCM750 eval board hw/i2c: Implement NPCM7XX SMBus Module Single Mode tests/tcg/aarch64: Add mte smoke tests target/arm: Enable MTE for user-only target/arm: Add allocation tag storage for user mode linux-user/aarch64: Signal SEGV_MTEAERR for async tag check error linux-user/aarch64: Signal SEGV_MTESERR for sync tag check fault linux-user/aarch64: Pass syndrome to EXC_*_ABORT target/arm: Split out syndrome.h from internals.h linux-user/aarch64: Implement PROT_MTE linux-user/aarch64: Implement PR_MTE_TCF and PR_MTE_TAG target/arm: Use the proper TBI settings for linux-user target/arm: Improve gen_top_byte_ignore linux-user/aarch64: Implement PR_TAGGED_ADDR_ENABLE linux-user: Handle tags in lock_user/unlock_user linux-user: Fix types in uaccess.c ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>