summaryrefslogtreecommitdiffstats
path: root/tcg/optimize.c
Commit message (Collapse)AuthorAgeFilesLines
* tcg: Remove movi and dupi opcodesRichard Henderson2021-01-131-4/+0Star
| | | | | | | | | These are now completely covered by mov from a TYPE_CONST temporary. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Convert tcg_gen_dupi_vec to TCG_CONSTRichard Henderson2021-01-131-5/+4Star
| | | | | | | | Because we now store uint64_t in TCGTemp, we can now always store the full 64-bit duplicate immediate. So remove the difference between 32- and 64-bit hosts. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Use tcg_constant_internal with constant foldingRichard Henderson2021-01-131-59/+49Star
| | | | Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Adjust TempOptInfo allocationRichard Henderson2021-01-131-26/+34
| | | | | | | | | | | | Do not allocate a large block for indexing. Instead, allocate for each temporary as they are seen. In general, this will use less memory, if we consider that most TBs do not touch every target register. This also allows us to allocate TempOptInfo for new temps created during optimization. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Improve find_better_copyRichard Henderson2021-01-131-15/+12Star
| | | | | | Prefer TEMP_CONST over anything else. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Introduce TYPE_CONST temporariesRichard Henderson2021-01-131-2/+11
| | | | | | | | | These will hold a single constant for the duration of the TB. They are hashed, so that each value has one temp across the TB. Not used yet, this is all infrastructure. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Expand TempOptInfo to 64-bitsRichard Henderson2021-01-131-19/+21
| | | | | | | This propagates the extended value of TCGTemp.val that we did before. In addition, it will be required for vector constants. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Rename struct tcg_temp_info to TempOptInfoRichard Henderson2021-01-131-16/+16
| | | | | | | | Fix this name vs our coding style. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Consolidate 3 bits into enum TCGTempKindRichard Henderson2021-01-131-4/+4
| | | | | | | | | The temp_fixed, temp_global, temp_local bits are all related. Combine them into a single enumeration. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Introduce INDEX_op_qemu_st8_i32Richard Henderson2021-01-071-0/+1
| | | | | | | | | Enable this on i386 to restrict the set of input registers for an 8-bit store, as required by the architecture. This removes the last use of scratch registers for user-only mode. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Add fallthrough annotationsThomas Huth2020-12-181-0/+4
| | | | | | | | | | | | | | | | To be able to compile this file with -Werror=implicit-fallthrough, we need to add some fallthrough annotations to the case statements that might fall through. Unfortunately, the typical "/* fallthrough */" comments do not work here as expected since some case labels are wrapped in macros and the compiler fails to match the comments in this case. But using __attribute__((fallthrough)) seems to work fine, so let's use that instead (by introducing a new QEMU_FALLTHROUGH macro in our compiler.h header file). Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20201211152426.350966-11-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
* tcg: Revert "tcg/optimize: Flush data at labels not TCG_OPF_BB_END"Richard Henderson2020-11-041-18/+17Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit cd0372c515c4732d8bd3777cdd995c139c7ed7ea. The patch is incorrect in that it retains copies between globals and non-local temps, and non-local temps still die at the end of the BB. Failing test case for hppa: .globl _start _start: cmpiclr,= 0x24,%r19,%r0 cmpiclr,<> 0x2f,%r19,%r19 ---- 00010057 0001005b movi_i32 tmp0,$0x24 sub_i32 tmp1,tmp0,r19 mov_i32 tmp2,tmp0 mov_i32 tmp3,r19 movi_i32 tmp1,$0x0 ---- 0001005b 0001005f brcond_i32 tmp2,tmp3,eq,$L1 movi_i32 tmp0,$0x2f sub_i32 tmp1,tmp0,r19 mov_i32 tmp2,tmp0 mov_i32 tmp3,r19 movi_i32 tmp1,$0x0 mov_i32 r19,tmp1 setcond_i32 psw_n,tmp2,tmp3,ne set_label $L1 In this case, both copies of "mov_i32 tmp3,r19" are removed. The second because opt thought it was redundant. The first is removed later by liveness because tmp3 is known to be dead. This leaves the setcond_i32 with an uninitialized input. Revert the entire patch for 5.2, and a proper optimization across the branch may be considered for the next development cycle. Reported-by: qemu@igor2.repo.hu Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Flush data at labels not TCG_OPF_BB_ENDRichard Henderson2020-10-271-17/+18
| | | | | | | | | We can easily propagate temp values through the entire extended basic block (in this case, the set of blocks connected by fallthru), simply by not discarding the register state at the branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Fold dup2_vecRichard Henderson2020-10-081-0/+15
| | | | | | | When the two arguments are identical, this can be reduced to dup_vec or to mov_vec from a tcg_constant_vec. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Search includes from the project root source directoryPhilippe Mathieu-Daudé2020-01-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently search both the root and the tcg/ directories for tcg files: $ git grep '#include "tcg/' | wc -l 28 $ git grep '#include "tcg[^/]' | wc -l 94 To simplify the preprocessor search path, unify by expliciting the tcg/ directory. Patch created mechanically by running: $ for x in \ tcg.h tcg-mo.h tcg-op.h tcg-opc.h \ tcg-op-gvec.h tcg-gvec-desc.h; do \ sed -i "s,#include \"$x\",#include \"tcg/$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-2-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: TCGMemOp is now accelerator independent MemOpTony Nguyen2019-09-031-1/+1
| | | | | | | | | | | | | | Preparation for collapsing the two byte swaps, adjust_endianness and handle_bswap, along the I/O path. Target dependant attributes are conditionalized upon NEED_CPU_H. Signed-off-by: Tony Nguyen <tony.nguyen@bt.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Clean up inclusion of exec/cpu-common.hMarkus Armbruster2019-08-161-1/+0Star
| | | | | | | | | | migration/qemu-file.h neglects to include it even though it needs ram_addr_t. Fix that. Drop a few superfluous inclusions elsewhere. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190812052359.30071-14-armbru@redhat.com>
* tcg: Fix constant folding of INDEX_op_extract2_i32Richard Henderson2019-07-141-2/+2
| | | | | | | | | | | On a 64-bit host, discard any replications of the 32-bit sign bit when performing the shift and merge. Fixes: https://bugs.launchpad.net/bugs/1834496 Tested-by: Christophe Lyon <christophe.lyon@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* Include qemu-common.h exactly where neededMarkus Armbruster2019-06-121-1/+0Star
| | | | | | | | | | | | | | | | No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
* tcg: Do not recreate INDEX_op_neg_vec unless supportedRichard Henderson2019-05-131-2/+6
| | | | | | | | Use tcg_can_emit_vec_op instead of just TCG_TARGET_HAS_neg_vec, so that we check the type and vece for the actual operation. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add INDEX_op_extract2_{i32,i64}Richard Henderson2019-04-241-0/+16
| | | | | | | This will let backends implement the double-word shift operation. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Drop nargs from tcg_op_insert_{before,after}Emilio G. Cota2018-12-171-2/+2
| | | | | | | | | It's unused since 75e8b9b7aa0b95a761b9add7e2f09248b101a392. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181209193749.12277-9-cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Optimize bswapRichard Henderson2018-12-171-0/+12
| | | | | | | | Somehow we forgot these operations, once upon a time. This will allow immediate stores to have their bswap optimized away. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/optimize: Do not skip default processing of dup_vecRichard Henderson2018-08-061-2/+2
| | | | | | | | | | | | If we do not opimize away dup_vec, we must mark its output as changed. Fixes: 170ba88f45b Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Message-id: 20180805233258.31892-1-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* tcg/optimize: Handle vector opcodes during optimizeRichard Henderson2018-02-081-73/+77
| | | | | | | | | Trivial move and constant propagation. Some identity and constant function folding, but nothing that requires knowledge of the size of the vector element. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Generalize TCGOp parametersRichard Henderson2017-12-291-2/+2
| | | | | | | | We had two fields specific to INDEX_op_call. Rename these and add some macros so that the fields may be reused for other opcodes. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Dynamically allocate TCGOpsRichard Henderson2017-12-291-13/+3Star
| | | | | | | | | | With no fixed array allocation, we can't overflow a buffer. This will be important as optimizations related to host vectors may expand the number of ops used. Use QTAILQ to link the ops together. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: allocate optimizer temps with tcg_mallocEmilio G. Cota2017-10-241-23/+19Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Groundwork for supporting multiple TCG contexts. While at it, also allocate temps_used directly as a bitmap of the required size, instead of using a bitmap of TCG_MAX_TEMPS via TCGTempSet. Performance-wise we lose about 1.12% in a translation-heavy workload such as booting+shutting down debian-arm: Performance counter stats for 'taskset -c 0 arm-softmmu/qemu-system-arm \ -machine type=virt -nographic -smp 1 -m 4096 \ -netdev user,id=unet,hostfwd=tcp::2222-:22 \ -device virtio-net-device,netdev=unet \ -drive file=die-on-boot.qcow2,id=myblock,index=0,if=none \ -device virtio-blk-device,drive=myblock \ -kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \ -name arm,debug-threads=on -smp 1' (10 runs): exec time (s) Relative slowdown wrt original (%) --------------------------------------------------------------- original 20.213321616 0. tcg_malloc 20.441130078 1.1270214 TCGContext 20.477846517 1.3086662 g_malloc 20.780527895 2.8061013 The other two alternatives shown in the table are: - TCGContext: embed temps[TCG_MAX_TEMPS] and TCGTempSet used_temps in TCGContext. This is simple enough but it isn't faster than using tcg_malloc; moreover, it wastes memory. - g_malloc: allocate/deallocate both temps and used_temps every time tcg_optimize is executed. Suggested-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Use per-temp state data in optimizeRichard Henderson2017-10-241-182/+241
| | | | | | | | | While we're touching many of the lines anyway, adjust the naming of the functions to better distinguish when "TCGArg" vs "TCGTemp" should be used. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add temp_global bit to TCGTempRichard Henderson2017-10-241-7/+8
| | | | | | | | This avoids needing to test the index of a temp against nb_globals. Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Introduce arg_tempRichard Henderson2017-10-241-2/+2
| | | | | | Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Propagate args to op->args in optimizerRichard Henderson2017-10-241-203/+227
| | | | | | Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Merge opcode arguments into TCGOpRichard Henderson2017-10-241-3/+3
| | | | | | | | | Rather than have a separate buffer of 10*max_ops entries, give each opcode 10 entries. The result is actually a bit smaller and should have slightly more cache locality. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add opcode for ctpopRichard Henderson2017-01-101-0/+14
| | | | | | | | | The number of actual invocations of ctpop itself does not warrent an opcode, but it is very helpful for POWER7 to use in generating an expansion for ctz. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add clz and ctz opcodesRichard Henderson2017-01-101-0/+36
| | | | | Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/optimize: Fold movcond 0/1 into setcondRichard Henderson2017-01-101-0/+15
| | | | Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add field extraction primitivesRichard Henderson2017-01-101-0/+29
| | | | | | | | Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of fixed position bitfields, much like we already have for deposit. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/optimize: move default return out of if statementAlex Bennée2016-10-041-2/+1Star
| | | | | | | | | | | This is to appease sanitizer builds which complain that: "error: control reaches end of non-void function" Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20160930213106.20186-5-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* tcg: Optimize fence instructionsPranith Kumar2016-09-161-0/+39
| | | | | | | | | | | | | | This commit optimizes fence instructions. Two optimizations are currently implemented: (1) unnecessary duplicate fence instructions, and (2) merging weaker fences into a stronger fence. [rth: Merge tcg_optimize_mb back into tcg_optimize, so that we only loop over the opcode stream once. Merge "unrelated" weaker barriers into one stronger barrier.] Signed-off-by: Pranith Kumar <bobby.prani@gmail.com> Message-Id: <20160823134825.32578-1-bobby.prani@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Lower indirect registers in a separate passRichard Henderson2016-08-051-29/+2Star
| | | | | | | | | | | | | | | | | Rather than rely on recursion during the middle of register allocation, lower indirect registers to loads and stores off the indirect base into plain temps. For an x86_64 host, with sufficient registers, this results in identical code, modulo the actual register assignments. For an i686 host, with insufficient registers, this means that temps can be (temporarily) spilled to the stack in order to satisfy an allocation. This as opposed to the possibility of not being able to spill, to allocate a register for the indirect base, in order to perform a spill. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Reorg TCGOp chainingRichard Henderson2016-08-051-6/+2Star
| | | | | | | | Instead of using -1 as end of chain, use 0, and link through the 0 entry as a fully circular double-linked list. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* exec: extract exec/tb-context.hPaolo Bonzini2016-05-191-1/+1
| | | | | | | | TCG backends do not need most of exec-all.h; extract what they actually need to a separate file or move it directly to tcg.h. The next patch will stop including exec-all.h from everywhere. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* qemu-common: push cpu.h inclusion out of qemu-common.hPaolo Bonzini2016-05-191-2/+1Star
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* tcg: use tcg_debug_assert instead of assert (fix performance regression)Aurelien Jarno2016-04-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The TCG code is quite performance sensitive, but at the same time can also be quite tricky. That is why asserts that can be enabled with the --enable-debug-tcg configure option. This used to work the following way: | #include "config.h" | | ... | | #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG) | /* define it to suppress various consistency checks (faster) */ | #define NDEBUG | #endif | | ... | | #include <assert.h> Since commit 757e725b (tcg: Clean up includes) "config.h" as been replaced by "qemu/osdep.h" which itself includes <assert.h>. As a consequence the assertions are always enabled, even when using --disable-debug-tcg, causing a performance regression, especially on targets with many registers. For instance on qemu-system-ppc the speed difference is about 15%. tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already uses in some places. This patch replaces all the calls to assert into calss to tcg_debug_assert. Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Message-id: 1461228530-14852-1-git-send-email-aurelien@aurel32.net Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* tcg: Clean up includesPeter Maydell2016-01-291-3/+1Star
| | | | | | | | | | Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1453832250-766-16-git-send-email-peter.maydell@linaro.org
* tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32Richard Henderson2015-08-241-11/+11
| | | | | | | Rather than allow arbitrary shift+trunc, only concern ourselves with low and high parts. This is all that was being used anyway. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 opsAurelien Jarno2015-08-241-0/+13
| | | | | | | | | They behave the same as ext32s_i64 and ext32u_i64 from the constant folding and zero propagation point of view, except that they can't be replaced by a mov, so we don't compute the affected value. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: rename trunc_shr_i32 into trunc_shr_i64_i32Aurelien Jarno2015-08-241-3/+3
| | | | | | | | | | | The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32, and the name in the README doesn't match the name offered to the frontends. Always use the long name to make it clear it is a size changing op. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/optimize: allow constant to have copiesAurelien Jarno2015-08-241-8/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that copies and constants are tracked separately, we can allow constant to have copies, deferring the choice to use a register or a constant to the register allocation pass. This prevent this kind of regular constant reloading: -OUT: [size=338] +OUT: [size=298] mov -0x4(%r14),%ebp test %ebp,%ebp jne 0x7ffbe9cb0ed6 mov $0x40002219f8,%rbp mov %rbp,(%r14) - mov $0x40002219f8,%rbp mov $0x4000221a20,%rbx mov %rbp,(%rbx) mov $0x4000000000,%rbp mov %rbp,(%r14) - mov $0x4000000000,%rbp mov $0x4000221d38,%rbx mov %rbp,(%rbx) mov $0x40002221a8,%rbp mov %rbp,(%r14) - mov $0x40002221a8,%rbp mov $0x4000221d40,%rbx mov %rbp,(%rbx) mov $0x4000019170,%rbp mov %rbp,(%r14) - mov $0x4000019170,%rbp mov $0x4000221d48,%rbx mov %rbp,(%rbx) mov $0x40000049ee,%rbp mov %rbp,0x80(%r14) mov %r14,%rdi callq 0x7ffbe99924d0 mov $0x4000001680,%rbp mov %rbp,0x30(%r14) mov 0x10(%r14),%rbp mov $0x4000001680,%rbp mov %rbp,0x30(%r14) mov 0x10(%r14),%rbp shl $0x20,%rbp mov (%r14),%rbx mov %ebx,%ebx mov %rbx,(%r14) or %rbx,%rbp mov %rbp,0x10(%r14) mov %rbp,0x90(%r14) mov 0x60(%r14),%rbx mov %rbx,0x38(%r14) mov 0x28(%r14),%rbx mov $0x4000220e60,%r12 mov %rbx,(%r12) mov $0x40002219c8,%rbx mov %rbp,(%rbx) mov 0x20(%r14),%rbp sub $0x8,%rbp mov $0x4000004a16,%rbx mov %rbx,0x0(%rbp) mov %rbp,0x20(%r14) mov $0x19,%ebp mov %ebp,0xa8(%r14) mov $0x4000015110,%rbp mov %rbp,0x80(%r14) xor %eax,%eax jmpq 0x7ffbebcae426 lea -0x5f6d72a(%rip),%rax # 0x7ffbe3d437b3 jmpq 0x7ffbebcae426 Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/optimize: track const/copy status separatelyAurelien Jarno2015-08-241-28/+14Star
| | | | | | | | | | | | Instead of using an enum which could be either a copy or a const, track them separately. This will be used in the next patch. Constants are tracked through a bool. Copies are tracked by initializing temp's next_copy and prev_copy to itself, allowing to simplify the code a bit. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>