bwlp/qemu.git - Experimental fork of QEMU with video encoding patches

	Commit message (Collapse)	Author	Age	Files	Lines
*	cpu-timers, icount: new modules	Claudio Fontana	2020-10-05	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	refactoring of cpus.c continues with cpu timer state extraction. cpu-timers: responsible for the softmmu cpu timers state, including cpu clocks and ticks. icount: counts the TCG instructions executed. As such it is specific to the TCG accelerator. Therefore, it is built only under CONFIG_TCG. One complication is due to qtest, which uses an icount field to warp time as part of qtest (qtest_clock_warp). In order to solve this problem, provide a separate counter for qtest. This requires fixing assumptions scattered in the code that qtest_enabled() implies icount_enabled(), checking each specific case. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [remove redundant initialization with qemu_spice_init] Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [fix lingering calls to icount_get] Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	disas: Move host asm annotations to tb_gen_code	Richard Henderson	2020-10-03	1	-9/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of creating GStrings and passing them into log_disas, just print the annotations directly in tb_gen_code. Fix the annotations for the slow paths of the TB, after the part implementing the final guest instruction. Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	qemu/atomic.h: rename atomic_ to qatomic_	Stefan Hajnoczi	2020-09-23	1	-27/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>
*	accel/tcg: better handle memory constrained systems	Alex Bennée	2020-07-27	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It turns out there are some 64 bit systems that have relatively low amounts of physical memory available to them (typically CI system). Even with swapping available a 1GB translation buffer that fills up can put the machine under increased memory pressure. Detect these low memory situations and reduce tb_size appropriately. Fixes: 600e17b2615 ("accel/tcg: increase default code gen buffer size for 64 bit") Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Robert Foley <robert.foley@linaro.org> Cc: BALATON Zoltan <balaton@eik.bme.hu> Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com> Message-Id: <20200724064509.331-7-alex.bennee@linaro.org>
*	osdep: Make MIN/MAX evaluate arguments only once	Eric Blake	2020-06-26	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'm not aware of any immediate bugs in qemu where a second runtime evaluation of the arguments to MIN() or MAX() causes a problem, but proactively preventing such abuse is easier than falling prey to an unintended case down the road. At any rate, here's the conversation that sparked the current patch: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg05718.html Update the MIN/MAX macros to only evaluate their argument once at runtime; this uses typeof(1 ? (a) : (b)) to ensure that we are promoting the temporaries to the same type as the final comparison (we have to trigger type promotion, as typeof(bitfield) won't compile; and we can't use typeof((a) + (b)) or even typeof((a) + 0), as some of our uses of MAX are on void* pointers where such addition is undefined). However, we are unable to work around gcc refusing to compile ({}) in a constant context (such as the array length of a static variable), even when only used in the dead branch of a __builtin_choose_expr(), so we have to provide a second macro pair MIN_CONST and MAX_CONST for use when both arguments are known to be compile-time constants and where the result must also be usable as a constant; this second form evaluates arguments multiple times but that doesn't matter for constants. By using a void expression as the expansion if a non-constant is presented to this second form, we can enlist the compiler to ensure the double evaluation is not attempted on non-constants. Alas, as both macros now rely on compiler intrinsics, they are no longer usable in preprocessor #if conditions; those will just have to be open-coded or the logic rewritten into #define or runtime 'if' conditions (but where the compiler dead-code-elimination will probably still apply). I tested that both gcc 10.1.1 and clang 10.0.0 produce errors for all forms of macro mis-use. As the errors can sometimes be cryptic, I'm demonstrating the gcc output: Use of MIN when MIN_CONST is needed: In file included from /home/eblake/qemu/qemu-img.c:25: /home/eblake/qemu/include/qemu/osdep.h:249:5: error: braced-group within expression allowed only inside a function 249 \| ({ \ \| ^ /home/eblake/qemu/qemu-img.c:92:12: note: in expansion of macro ‘MIN’ 92 \| char array[MIN(1, 2)] = ""; \| ^~~ Use of MIN_CONST when MIN is needed: /home/eblake/qemu/qemu-img.c: In function ‘is_allocated_sectors’: /home/eblake/qemu/qemu-img.c:1225:15: error: void value not ignored as it ought to be 1225 \| i = MIN_CONST(i, n); \| ^ Use of MIN in the preprocessor: In file included from /home/eblake/qemu/accel/tcg/translate-all.c:20: /home/eblake/qemu/accel/tcg/translate-all.c: In function ‘page_check_range’: /home/eblake/qemu/include/qemu/osdep.h:249:6: error: token "{" is not valid in preprocessor expressions 249 \| ({ \ \| ^ Fix the resulting callsites that used #if or computed a compile-time constant min or max to use the new macros. cpu-defs.h is interesting, as CPU_TLB_DYN_MAX_BITS is sometimes used as a constant and sometimes dynamic. It may be worth improving glib's MIN/MAX definitions to be saner, but that is a task for another day. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200625162602.700741-1-eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	translate-all: call qemu_spin_destroy for PageDesc	Emilio G. Cota	2020-06-16	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \|	The radix tree is append-only, but we can fail to insert a PageDesc if the insertion races with another thread. Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200609200738.445-8-robert.foley@linaro.org> Message-Id: <20200612190237.30436-11-alex.bennee@linaro.org>
*	tcg: call qemu_spin_destroy for tb->jmp_lock	Emilio G. Cota	2020-06-16	1	-0/+8
\| \| \| \| \| \| \| \| \|	Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Robert Foley <robert.foley@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> [RF: minor changes + remove tb_destroy_func] Message-Id: <20200609200738.445-7-robert.foley@linaro.org> Message-Id: <20200612190237.30436-10-alex.bennee@linaro.org>
*	translate-all: include guest address in out_asm output	Alex Bennée	2020-05-15	1	-6/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already have information about where each guest instructions representation starts stored in the tcg_ctx->gen_insn_data so we can rectify the PC for faults. We can re-use this information to annotate the out_asm output with guest instruction address which makes it a bit easier to work out where you are especially with longer blocks. A minor wrinkle is that some instructions get optimised away so we have to scan forward until we find some actual generated code. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200513175134.19619-11-alex.bennee@linaro.org>
*	disas: include an optional note for the start of disassembly	Alex Bennée	2020-05-15	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	This will become useful shortly for providing more information about output assembly inline. While there fix up the indenting and code formatting in disas(). Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200513175134.19619-9-alex.bennee@linaro.org>
*	accel/tcg: Relax va restrictions on 64-bit guests	Richard Henderson	2020-05-15	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We cannot at present limit a 64-bit guest to a virtual address space smaller than the host. It will mostly work to ignore this limitation, except if the guest uses high bits of the address space for tags. But it will certainly work better, as presently we can wind up failing to allocate the guest stack. Widen our user-only page tree to the host or abi pointer width. Remove the workaround for this problem from target/alpha. Always validate guest addresses vs reserved_va, as there we control allocation ourselves. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200513175134.19619-7-alex.bennee@linaro.org>
*	tcg: Remove softmmu code_gen_buffer fixed address	Richard Henderson	2020-03-28	1	-32/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commentary talks about "in concert with the addresses assigned in the relevant linker script", except there is no linker script for softmmu, nor has there been for some time. (Do not confuse the user-only linker script editing that was removed in the previous patch, because user-only does not use this code_gen_buffer allocation method.) Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	accel/tcg: increase default code gen buffer size for 64 bit	Alex Bennée	2020-02-29	1	-9/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While 32mb is certainly usable a full system boot ends up flushing the codegen buffer nearly 100 times. Increase the default on 64 bit hosts to take advantage of all that spare memory. After this change I can boot my tests system without any TB flushes. As we usually run more CONFIG_USER binaries at a time in typical usage we aren't quite as profligate for user-mode code generation usage. We also bring the static code gen defies to the same place to keep all the reasoning in the comments together. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Message-Id: <20200228192415.19867-5-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	accel/tcg: only USE_STATIC_CODE_GEN_BUFFER on 32 bit hosts	Alex Bennée	2020-02-29	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no particular reason to use a static codegen buffer on 64 bit hosts as we have address space to burn. Allow the common CONFIG_USER case to use the mmap'ed buffers like SoftMMU. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Message-Id: <20200228192415.19867-4-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	accel/tcg: remove link between guest ram and TCG cache size	Alex Bennée	2020-02-29	1	-8/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Basing the TB cache size on the ram_size was always a little heuristic and was broken by a1b18df9a4 which caused ram_size not to be fully realised at the time we initialise the TCG translation cache. The current DEFAULT_CODE_GEN_BUFFER_SIZE may still be a little small but follow-up patches will address that. Fixes: a1b18df9a4 Cc: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Message-Id: <20200228192415.19867-3-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	accel/tcg: use units.h for defining code gen buffer sizes	Alex Bennée	2020-02-29	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \| \|	It's easier to read. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200228192415.19867-2-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	tcg: Search includes from the project root source directory	Philippe Mathieu-Daudé	2020-01-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently search both the root and the tcg/ directories for tcg files: $ git grep '#include "tcg/' \| wc -l 28 $ git grep '#include "tcg[^/]' \| wc -l 94 To simplify the preprocessor search path, unify by expliciting the tcg/ directory. Patch created mechanically by running: $ for x in \ tcg.h tcg-mo.h tcg-op.h tcg-opc.h \ tcg-op-gvec.h tcg-gvec-desc.h; do \ sed -i "s,#include \"$x\",#include \"tcg/$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-2-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	qemu_log_lock/unlock now preserves the qemu_logfile handle.	Robert Foley	2019-12-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	qemu_log_lock() now returns a handle and qemu_log_unlock() receives a handle to unlock. This allows for changing the handle during logging and ensures the lock() and unlock() are for the same file. Also in target/tilegx/translate.c removed the qemu_log_lock()/unlock() calls (and the log("\n")), since the translator can longjmp out of the loop if it attempts to translate an instruction in an inaccessible page. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-5-robert.foley@linaro.org>
*	Merge remote-tracking branch ↵	Peter Maydell	2019-10-30	1	-2/+13
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	'remotes/stsquad/tags/pull-tcg-plugins-281019-4' into staging TCG Plugins initial implementation - use --enable-plugins @ configure - low impact introspection (-plugin empty.so to measure overhead) - plugins cannot alter guest state - example plugins included in source tree (tests/plugins) - -d plugin to enable plugin output in logs - check-tcg runs extra tests when plugins enabled - documentation in docs/devel/plugins.rst # gpg: Signature made Mon 28 Oct 2019 15:13:23 GMT # gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full] # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-tcg-plugins-281019-4: (57 commits) travis.yml: enable linux-gcc-debug-tcg cache MAINTAINERS: add me for the TCG plugins code scripts/checkpatch.pl: don't complain about (foo, /* empty */) .travis.yml: add --enable-plugins tests include/exec: wrap cpu_ldst.h in CONFIG_TCG accel/stubs: reduce headers from tcg-stub tests/plugin: add hotpages to analyse memory access patterns tests/plugin: add instruction execution breakdown tests/plugin: add a hotblocks plugin tests/tcg: enable plugin testing tests/tcg: drop test-i386-fprem from TESTS when not SLOW tests/tcg: move "virtual" tests to EXTRA_TESTS tests/tcg: set QEMU_OPTS for all cris runs tests/tcg/Makefile.target: fix path to config-host.mak tests/plugin: add sample plugins linux-user: support -plugin option vl: support -plugin option plugin: add qemu_plugin_outs helper plugin: add qemu_plugin_insn_disas helper plugin: expand the plugin_init function to include an info block ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
\| *	translate-all: notify plugin code of tb_flush	Emilio G. Cota	2019-10-28	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Plugins might allocate per-TB data that then they get passed each time a TB is executed (via the *userdata pointer). Notify plugin code every time a code cache flush occurs, so that plugins can then reclaim the memory of the per-TB data. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
\| *	translate-all: use cpu_in_exclusive_work_context() in tb_flush	Emilio G. Cota	2019-10-28	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tb_flush will be called by the plugin module from a safe work environment. Prepare for that. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
* \|	translate-all: Remove tb_alloc	Richard Henderson	2019-10-28	1	-18/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since 2ac01d6dafab, this function does only two things: assert a lock is held, and call tcg_tb_alloc. It is used exactly once, and its user has already done the assert. Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Clement Deschamps <clement.deschamps@greensocs.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* \|	translate-all: fix uninitialized tb->orig_tb	Clement Deschamps	2019-10-28	1	-0/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a segmentation fault in icount mode when executing from an IO region. TB is marked as CF_NOCACHE but tb->orig_tb is not initialized (equals previous value in code_gen_buffer). The issue happens in cpu_io_recompile() when it tries to invalidate orig_tb. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Clement Deschamps <clement.deschamps@greensocs.com> Message-Id: <20191022140016.918371-1-clement.deschamps@greensocs.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	cputlb: Pass retaddr to tb_check_watchpoint	Richard Henderson	2019-09-25	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	Fixes the previous TLB_WATCHPOINT patches because we are currently failing to set cpu->mem_io_pc with the call to cpu_check_watchpoint. Pass down the retaddr directly because it's readily available. Fixes: 50b107c5d61 Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	cputlb: Pass retaddr to tb_invalidate_phys_page_fast	Richard Henderson	2019-09-25	1	-20/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than rely on cpu->mem_io_pc, pass retaddr down directly. Within tb_invalidate_phys_page_range__locked, the is_cpu_write_access parameter is non-zero exactly when retaddr would be non-zero, so that is a simple replacement. Recognize that current_tb_not_found is true only when mem_io_pc (and now retaddr) are also non-zero, so remove a redundant test. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	cputlb: Remove tb_invalidate_phys_page_range is_cpu_write_access	Richard Henderson	2019-09-25	1	-4/+2
\| \| \| \| \| \| \| \| \|	All callers pass false to this argument. Remove it and pass the constant on to tb_invalidate_phys_page_range__locked. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	Include qemu-common.h exactly where needed	Markus Armbruster	2019-06-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
*	qemu-common: Move tcg_enabled() etc. to sysemu/tcg.h	Markus Armbruster	2019-06-11	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Other accelerators have their own headers: sysemu/hax.h, sysemu/hvf.h, sysemu/kvm.h, sysemu/whpx.h. Only tcg_enabled() & friends sit in qemu-common.h. This necessitates inclusion of qemu-common.h into headers, which is against the rules spelled out in qemu-common.h's file comment. Move tcg_enabled() & friends into their own header sysemu/tcg.h, and adjust #include directives. Cc: Richard Henderson <rth@twiddle.net> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-2-armbru@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [Rebased with conflicts resolved automatically, except for accel/tcg/tcg-all.c]
*	cpu: Move icount_decr to CPUNegativeOffsetState	Richard Henderson	2019-06-10	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Amusingly, we had already ignored the comment to keep this value at the end of CPUState. This restores the minimum negative offset from TCG_AREG0 for code generation. For the couple of uses within qom/cpu.c, without NEED_CPU_H, add a pointer from the CPUState object to the IcountDecr object within CPUNegativeOffsetState. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	cpu: Replace ENV_GET_CPU with env_cpu	Richard Henderson	2019-06-10	1	-1/+1
\| \| \| \| \| \| \| \| \|	Now that we have both ArchCPU and CPUArchState, we can define this generically instead of via macro in each target's cpu.h. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	tcg: Restart after TB code generation overflow	Richard Henderson	2019-04-24	1	-6/+32
\| \| \| \| \| \| \| \|	If a TB generates too much code, try again with fewer insns. Fixes: https://bugs.launchpad.net/bugs/1824853 Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	tcg: Hoist max_insns computation to tb_gen_code	Richard Henderson	2019-04-24	1	-2/+13
\| \| \| \| \| \| \| \| \| \|	In order to handle TB's that translate to too much code, we need to place the control of the length of the translation in the hands of the code gen master loop. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	tcg: Simplify how dump_exec_info() prints	Markus Armbruster	2019-04-18	1	-22/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dump_exec_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_jit() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-5-armbru@redhat.com>
*	tcg: Simplify how dump_opcount_info() prints	Markus Armbruster	2019-04-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	dump_opcount_info() takes an fprintf()-like callback and a FILE * to pass to it. Its only caller hmp_info_opcount() passes monitor_fprintf() and the current monitor cast to FILE *. monitor_fprintf() casts it right back, and is otherwise identical to monitor_printf(). The type-punning is ugly. Drop the callback, and call qemu_printf() instead. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20190417191805.28198-4-armbru@redhat.com>
*	tcg: Fix LGPL version number	Thomas Huth	2019-01-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1548252536-6242-5-git-send-email-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
*	accel/tcg: Add cluster number to TCG TB hash	Peter Maydell	2019-01-29	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Include the cluster number in the hash we use to look up TBs. This is important because a TB that is valid for one cluster at a given physical address and set of CPU flags is not necessarily valid for another: the two clusters may have different views of physical memory, or may have different CPU features (eg FPU present or absent). We put the cluster number in the high 8 bits of the TB cflags. This gives us up to 256 clusters, which should be enough for anybody. If we ever need more, or need more bits in cflags for other purposes, we could make tb_hash_func() take more data (and expand qemu_xxhash7() to qemu_xxhash8()). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 20190121152218.9592-4-peter.maydell@linaro.org
*	build-sys: don't include windows.h, osdep.h does it	Marc-André Lureau	2019-01-11	1	-4/+0
\| \| \| \| \| \| \| \|	osdep.h will also define the available Windows API version for QEMU. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20181122110039.15972-2-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	cputlb: Count "partial" and "elided" tlb flushes	Richard Henderson	2018-10-31	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Our only statistic so far was "full" tlb flushes, where all mmu_idx are flushed at the same time. Now count "partial" tlb flushes where sets of mmu_idx are flushed, but the set is not maximal. Account one per mmu_idx flushed, as that is the unit of work performed. We don't actually count elided flushes yet, but go ahead and change the interface presented to the monitor all at once. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	tcg: access cpu->icount_decr.u16.high with atomics	Emilio G. Cota	2018-10-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Consistently access u16.high with atomics to avoid undefined behaviour in MTTCG. Note that icount_decr.u16.low is only used in icount mode, so regular accesses to it are OK. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181010144853.13005-2-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	accel/tcg: Remove dead code	Thomas Huth	2018-10-02	1	-9/+0
\| \| \| \| \| \| \| \| \| \| \|	The global cpu_single_env variable has been removed more than 5 years ago, so apparently nobody used this dead debug code in that timeframe anymore. Thus let's remove it completely now. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1537204134-15905-1-git-send-email-thuth@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	qht: drop ht argument from qht iterators	Emilio G. Cota	2018-09-26	1	-4/+2
\| \| \| \| \| \| \| \| \| \|	Accessing the HT from an iterator results almost always in a deadlock. Given that only one qht-internal function uses this argument, drop it from the interface. Suggested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	accel/tcg: tb_gen_code(): Create single-insn TB for execution from non-RAM	Peter Maydell	2018-08-14	1	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If get_page_addr_code() returns -1, this indicates that there is no RAM page we can read a full TB from. Instead we must create a TB which contains a single instruction and which we do not cache, so it is executed only once. Since this means we can now have TBs which are not in any page list, we also need to make tb_phys_invalidate() handle them (by not trying to remove them from a nonexistent page list). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Tested-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180710160013.26559-5-peter.maydell@linaro.org
*	accel/tcg: Handle get_page_addr_code() returning -1 in tb_check_watchpoint()	Peter Maydell	2018-08-14	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	When we support execution from non-RAM MMIO regions, get_page_addr_code() will return -1 to indicate that there is no RAM at the requested address. Handle this in tb_check_watchpoint() -- if the exception happened for a PC which doesn't correspond to RAM then there is no need to invalidate any TBs, because the one-instruction TB will not have been cached. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180710160013.26559-4-peter.maydell@linaro.org
*	accel: Fix typo and grammar in comment	Stefan Weil	2018-07-16	1	-1/+1
\| \| \| \| \| \| \| \|	The typo was found by codespell. Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-Id: <20180712194454.26765-1-sw@weilnetz.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	translate-all: honour CF_NOCACHE in tb_gen_code	Emilio G. Cota	2018-07-09	1	-15/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a record-replay regression introduced by 95590e2 ("translate-all: discard TB when tb_link_page returns an existing matching TB", 2018-06-15). The problem is that code using CF_NOCACHE assumes that the TB returned from tb_gen_code is always a newly-generated one. This assumption, however, was broken in the aforementioned commit. Fix it by honouring CF_NOCACHE, so that tb_gen_code always returns a newly-generated TB when CF_NOCACHE is passed to it. Do this by avoiding the TB hash table if CF_NOCACHE is set. Reported-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Tested-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 1530806837-5416-1-git-send-email-cota@braap.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	translate-all: fix locking of TBs whose two pages share the same physical page	Emilio G. Cota	2018-07-02	1	-7/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 0b5c91f ("translate-all: use per-page locking in !user-mode", 2018-06-15) introduced per-page locking. It assumed that the physical pages corresponding to a TB (at most two pages) are always distinct, which is wrong. For instance, an xtensa test provided by Max Filippov is broken by the commit, since the test maps two virtual pages to the same physical page: virt1: 7fff, virt2: 8000 phys1 6000fff, phys2 6000000 Fix it by removing the assumption from page_lock_pair. If the two physical page addresses are equal, we only lock the PageDesc once. Note that the two callers of page_lock_pair, namely page_unlock_tb and tb_link_page, are also updated so that we do not try to unlock the same PageDesc twice. Fixes: 0b5c91f74f3c83a36f37740969df8c775c997e69 Reported-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1529944302-14186-1-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging	Peter Maydell	2018-06-29	1	-22/+6
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* "info mtree" improvements (Alexey) * fake VPD block limits for SCSI passthrough (Daniel Barboza) * chardev and main loop fixes (Daniel Berrangé, Sergio, Stefan) * help fixes (Eduardo) * pc-dimm refactoring (David) * tests improvements and fixes (Emilio, Thomas) * SVM emulation fixes (Jan) * MemoryRegionCache fix (Eric) * WHPX improvements (Justin) * ESP cleanup (Mark) * -overcommit option (Michael) * qemu-pr-helper fixes (me) * "info pic" improvements for x86 (Peter) * x86 TCG emulation fixes (Richard) * KVM slot handling fix (Shannon) * Next round of deprecation (Thomas) * Windows dump format support (Viktor) # gpg: Signature made Fri 29 Jun 2018 12:03:05 BST # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (60 commits) tests/boot-serial: Do not delete the output file in case of errors hw/scsi: add VPD Block Limits emulation hw/scsi: centralize SG_IO calls into single function hw/scsi: cleanups before VPD BL emulation dump: add Windows live system dump dump: add fallback KDBG using in Windows dump dump: use system context in Windows dump dump: add Windows dump format to dump-guest-memory i386/cpu: make -cpu host support monitor/mwait kvm: support -overcommit cpu-pm=on\|off hmp: obsolete "info ioapic" ioapic: support "info irq" ioapic: some proper indents when dump info ioapic: support "info pic" doc: another fix to "info pic" target-i386: Mark cpu_vmexit noreturn target-i386: Allow interrupt injection after STGI target-i386: Add NMI interception to SVM memory/hmp: Print owners/parents in "info mtree" WHPX: register for unrecognized MSR exits ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
\| *	move public invalidate APIs out of translate-all.{c,h}, clean up	Paolo Bonzini	2018-06-28	1	-22/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Place them in exec.c, exec-all.h and ram_addr.h. This removes knowledge of translate-all.h (which is an internal header) from several files outside accel/tcg and removes knowledge of AddressSpace from translate-all.c (as it only operates on ram_addr_t). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* \|	compiler: add a sizeof_field() macro	Stefan Hajnoczi	2018-06-27	1	-1/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \|	Determining the size of a field is useful when you don't have a struct variable handy. Open-coding this is ugly. This patch adds the sizeof_field() macro, which is similar to typeof_field(). Existing instances are updated to use the macro. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20180614164431.29305-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
*	tcg: remove tb_lock	Emilio G. Cota	2018-06-15	1	-92/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+--------------+-+ \| + + + + + B \| \| before *B* ** * \| \|tb lock removal ###D### * \| 600 +-+ * +-+ \| ** # \| \| B #D \| \| *** * ## \| 500 +-+ *** ### +-+ \| * *** ### \| \| B # ## \| \| ** * #D# \| 400 +-+ ## +-+ \| ### \| \| ## \| \| # ## \| 300 +-+ * B* #D# +-+ \| B *** ### \| \| * ** #### \| \| * *** ### \| 200 +-+ B B #D# +-+ \| #B * ## # \| \| #* ## \| \| + D##D# + + + + \| 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ \| + + + + + + \| \| before *B* B \| 80 +tb lock removal ###D### D +-+ \| ### \| \| ## \| 70 +-+ # +-+ \| ## \| \| # \| 60 +-+ B ## +-+ \| * ## \| \| * #D \| 50 +-+ * ## +-+ \| * ### \| \| B* ### \| 40 +-+ ** # ## +-+ \| #D# \| \| B* ### \| 30 +-+ B*B #### +-+ \| B * * # ### \| \| B ###D# \| 20 +-+ D ##D## +-+ \| D# \| \| + + + + + + \| 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
*	translate-all: remove tb_lock mention from cpu_restore_state_from_tb	Emilio G. Cota	2018-06-15	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \|	tb_lock was needed when the function did retranslation. However, since fca8a500d519 ("tcg: Save insn data and use it in cpu_restore_state_from_tb") we don't do retranslation. Get rid of the comment. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>