path: root/tcg/aarch64
Commit message | Author | Age | Files | Lines
...
* tcg/aarch64: Do not advertise minmax for MO_64  [Richard Henderson, 2019-05-14, 1 file, -4/+4]
  The min/max instructions are not available for 64-bit elements.
  Fixes: 93f332a50371
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Support vector absolute value  [Richard Henderson, 2019-05-14, 2 files, -1/+7]
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Add support for vector absolute value  [Richard Henderson, 2019-05-14, 1 file, -0/+1]
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Support vector variable shift opcodes  [Richard Henderson, 2019-05-14, 3 files, -1/+45]
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add INDEX_op_dupm_vec  [Richard Henderson, 2019-05-14, 1 file, -0/+4]
  Allow the backend to expand dup from memory directly, instead of forcing
  the value into a temp first. This is especially important if
  integer/vector register moves do not exist.

  Note that officially tcg_out_dupm_vec is allowed to fail. If it did, we
  could fix this up relatively easily:

    VECE == 32/64: Load the value into a vector register, then dup.
    Both of these must work.

    VECE == 8/16: If the value happens to be at an offset such that an
    aligned load would place the desired value in the least significant
    end of the register, go ahead and load with garbage in the high bits.

    Otherwise:
      Load the value with INDEX_op_ld{8,16}_i32.
      Attempt a move directly to vector reg, which may fail.
      Store the value into the backing store for OTS.
      Load the value into the vector reg with TCG_TYPE_I32, which must work.
      Duplicate from the vector reg into itself, which must work.

  All of which is well and good, except that all supported hosts can
  support dupm for all vece, so all of the failure paths would be dead
  code and untestable.

  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
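  As a concrete reference for what dupm computes, here is a minimal
  standalone C sketch of the dup-from-memory semantics; the function name
  and signature are illustrative, not the TCG backend interface:

```c
#include <stdint.h>
#include <string.h>

/* Sketch: load one element of 'vece_bytes' (1, 2, 4 or 8) from 'src'
 * and replicate it across a 16-byte vector. */
static void dupm_vec(uint8_t dst[16], const void *src, int vece_bytes)
{
    for (int i = 0; i < 16; i += vece_bytes) {
        memcpy(dst + i, src, vece_bytes);
    }
}
```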
* tcg/aarch64: Implement tcg_out_dupm_vec  [Richard Henderson, 2019-05-14, 1 file, -2/+35]
  The LD1R instruction does all the work. Note that the only useful
  addressing mode is a base register with no offset.
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
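  A hedged sketch of the A64 LD1R encoding this relies on (load one
  element and replicate to all lanes), built from the published encoding
  rather than QEMU's emitter:

```c
#include <stdint.h>

/* LD1R {Vt.<T>}, [Xn]: base opcode 0x0d40c000; Q selects a 64- vs
 * 128-bit vector, size (0..3) the element width, Rn the base register
 * (no offset form), Vt the destination vector register. */
static uint32_t encode_ld1r(unsigned q, unsigned size, unsigned rn, unsigned vt)
{
    return 0x0d40c000u | q << 30 | size << 10 | rn << 5 | vt;
}
```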
* tcg: Add tcg_out_dupm_vec to the backend interface  [Richard Henderson, 2019-05-13, 1 file, -0/+6]
  Currently stubbed out in all backends that support vectors.
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Manually expand INDEX_op_dup_vec  [Richard Henderson, 2019-05-13, 1 file, -5/+4]
  This case is similar to INDEX_op_mov_* in that we need to do different
  things depending on the current location of the source.
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
  ---
  v3: Added some commentary to the tcg_reg_alloc_* functions.

* tcg: Promote tcg_out_{dup,dupi}_vec to backend interface  [Richard Henderson, 2019-05-13, 1 file, -2/+10]
  The i386 backend already has these functions, and the aarch64 backend
  could easily split out one. Nothing is done with these functions yet,
  but this will aid register allocation of INDEX_op_dup_vec in a later
  patch. Adjust the aarch64 tcg_out_dupi_vec signature to match the new
  interface.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Return bool success from tcg_out_mov  [Richard Henderson, 2019-05-13, 1 file, -2/+3]
  This patch merely changes the interface, aborting on all failures, of
  which there are currently none.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Reviewed-by: David Hildenbrand <david@redhat.com>
  Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
  Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Restart TB generation after out-of-line ldst overflow  [Richard Henderson, 2019-04-24, 1 file, -6/+10]
  This is part c of relocation overflow handling.
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Support INDEX_op_extract2_{i32,i64}  [Richard Henderson, 2019-04-24, 2 files, -2/+13]
  Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Add INDEX_op_extract2_{i32,i64}  [Richard Henderson, 2019-04-24, 1 file, -0/+2]
  This will let backends implement the double-word shift operation.
  Reviewed-by: David Hildenbrand <david@redhat.com>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* cputlb: Remove static tlb sizing  [Richard Henderson, 2019-01-28, 1 file, -1/+0]
  Now that all tcg backends support TCG_TARGET_IMPLEMENTS_DYN_TLB, remove
  the define and the old code.
  Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/aarch64: enable dynamic TLB sizing  [Richard Henderson, 2019-01-28, 2 files, -42/+60]
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Tested-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: introduce dynamic TLB sizing  [Emilio G. Cota, 2019-01-28, 1 file, -0/+1]
  Disabled in all TCG backends for now.
  Tested-by: Alex Bennée <alex.bennee@linaro.org>
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Emilio G. Cota <cota@braap.org>
  Message-Id: <20190116170114.26802-3-cota@braap.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Implement vector minmax arithmetic  [Richard Henderson, 2019-01-28, 2 files, -1/+25]
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Implement vector saturating arithmetic  [Richard Henderson, 2019-01-28, 2 files, -1/+25]
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Add opcodes for vector minmax arithmetic  [Richard Henderson, 2019-01-28, 1 file, -0/+1]
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Add opcodes for vector saturated arithmetic  [Richard Henderson, 2019-01-28, 1 file, -0/+1]
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP  [Richard Henderson, 2018-12-17, 1 file, -0/+1]
  For now, defined universally as true, since we previously required
  backends to implement swapped memory operations. Future patches may now
  remove that support where it is onerous.
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Return false on failure from patch_reloc  [Richard Henderson, 2018-12-17, 1 file, -16/+21]
  This does require an extra two checks within the slow paths to replace
  the assert that we're moving.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Return success from patch_reloc  [Richard Henderson, 2018-12-17, 1 file, -1/+2]
  This will move the assert for success from within (subroutines of)
  patch_reloc into the callers. It will also let new code do something
  different when a relocation is out of range. For the moment, all
  backends are trivially converted to return true.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/aarch64: Fold away "noaddr" branch routines  [Richard Henderson, 2018-12-17, 1 file, -19/+2]
  There is one use apiece for these. There is no longer a need for
  preserving branch offset operands, as we no longer re-translate.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Remove reloc_pc26_atomic  [Richard Henderson, 2018-12-17, 1 file, -12/+0]
  It is unused since b68686bd4bfeb70040b4099df993dfa0b4f37b03.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/aarch64: limit mul_vec size  [Alex Bennée, 2018-07-19, 1 file, -1/+2]
  In AdvSIMD we can only do 32x32 integer multiplies, although SVE is
  capable of larger 64-bit multiplies. As a result we can end up
  generating invalid opcodes. Fix this by only reporting that we can emit
  mul vector ops if the element size is small enough.

  Fixes a crash on sve-all-short-v8.3+sve@vq3/insn_mul_z_zi___INC.risu.bin
  when running on AArch64 hardware.

  Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
  Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
  Message-Id: <20180719154248.29669-1-alex.bennee@linaro.org>
  [rth: Removed the tcg_debug_assert -- there are plenty of other cases
  that we do not diagnose within the insn encoding helpers.]
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
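  A hedged sketch of the kind of guard involved: AdvSIMD MUL (vector)
  exists only for byte, half and word elements, so the backend must not
  claim support for 64-bit-element multiplies. The function name is
  illustrative, not the actual QEMU hook:

```c
#include <stdbool.h>

/* TCG's MemOp element sizes: MO_8 == 0, MO_16 == 1, MO_32 == 2, MO_64 == 3. */
static bool can_emit_mul_vec(unsigned vece)
{
    return vece < 3;   /* reject MO_64: there is no 64x64 AdvSIMD multiply */
}
```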
* tcg: Reduce max TB opcode count  [Richard Henderson, 2018-06-15, 1 file, -1/+1]
  Also, assert that we don't overflow either of two different offsets into
  the TB. Both unwind and goto_tb record a uint16_t for later use.

  This fixes an arm-softmmu test case utilizing NEON in which there is a
  TB generated that runs to 7800 opcodes, and compiles to 96k on an x86_64
  host. This overflows the 16-bit offset in which we record the goto_tb
  reset offset. Because of that overflow, we install a jump destination
  that goes to neverland. Boom.

  With this reduced op count, the same TB compiles to about 48k for
  aarch64, ppc64le, and x86_64 hosts, and neither assertion fires.

  Cc: qemu-stable@nongnu.org
  Reported-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
  Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
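  A small standalone demonstration of the failure mode, assuming the
  offset is stored as a uint16_t as the message describes:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t reset_offset = 96 * 1024;          /* offset into a huge TB */
    uint16_t stored = (uint16_t)reset_offset;   /* what the TB records */
    /* prints "real: 98304, stored: 32768" -- the patched jump lands in
     * neverland because the high bits were silently truncated */
    printf("real: %u, stored: %u\n", reset_offset, stored);
    return 0;
}
```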
* tcg/aarch64: Add vector operations  [Richard Henderson, 2018-02-08, 3 files, -47/+569]
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Fully convert tcg_target_op_def  [Richard Henderson, 2017-09-17, 1 file, -131/+151]
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Remove tcg_regset_set32  [Richard Henderson, 2017-09-17, 1 file, -16/+17]
  It's not even clear what the interface REG and VAL32 were supposed to
  mean. All uses had REG = 0, and VAL32 was the bitset assigned to the
  destination.
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg: Remove tcg_regset_clear  [Richard Henderson, 2017-09-17, 1 file, -1/+1]
  Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Use constant pool for movi  [Richard Henderson, 2017-09-07, 2 files, -30/+33]
  Signed-off-by: Richard Henderson <rth@twiddle.net>

* tcg: Rearrange ldst label tracking  [Richard Henderson, 2017-09-07, 2 files, -1/+6]
  Dispense with TCGBackendData, as it has never been used for more than
  holding a single pointer. Use a define in the cpu/tcg-target.h to signal
  a requirement for TCGLabelQemuLdst, so that we can drop the no-op
  tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c.
  Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.h  [Richard Henderson, 2017-09-07, 2 files, -9/+9]
  Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump
  boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional
  function tb_target_set_jmp_target. While we're touching all backends,
  add a parameter for tb->tc_ptr; we're going to need it shortly for some
  backends. Move tb_set_jmp_target and tb_add_jump from exec-all.h to
  cpu-exec.c. This opens the possibility for TCG_TARGET_HAS_direct_jump
  to be a runtime decision -- based on host cpu capabilities, the size of
  code_gen_buffer, or a future debugging switch.
  Signed-off-by: Richard Henderson <rth@twiddle.net>

* tcg: Add tcg target default memory ordering  [Pranith Kumar, 2017-09-05, 1 file, -0/+2]
  Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
  Message-Id: <20170829063313.10237-3-bobby.prani@gmail.com>
  [rth: Dropped ia64 hunk]
  Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

* tcg/aarch64: Enable indirect jump path using LDR (literal)  [Pranith Kumar, 2017-07-10, 1 file, -14/+28]
  This patch enables the indirect jump path using an LDR (literal)
  instruction. It will be interesting to test and see which of the two
  paths performs better.
  CC: Alex Bennée <alex.bennee@linaro.org>
  Reviewed-by: Richard Henderson <rth@twiddle.net>
  Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
  Message-Id: <20170630143614.31059-3-bobby.prani@gmail.com>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/aarch64: Use ADRP+ADD to compute target address  [Pranith Kumar, 2017-07-10, 1 file, -6/+30]
  We use ADRP+ADD to compute the target address for goto_tb. This patch
  introduces the NOP instruction, which is used to align the above
  instruction pair so that we can use one atomic instruction to patch the
  destination offsets.
  CC: Alex Bennée <alex.bennee@linaro.org>
  Reviewed-by: Richard Henderson <rth@twiddle.net>
  Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
  Message-Id: <20170630143614.31059-2-bobby.prani@gmail.com>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
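  A hedged sketch of the two encodings involved, built from the published
  A64 encodings rather than QEMU's emitters: ADRP materializes the 4 KB
  page of the target, ADD (immediate) supplies the low 12 bits, and once
  the pair is 8-byte aligned it can be rewritten with a single atomic
  64-bit store:

```c
#include <stdint.h>

/* ADRP Xd, <page>: 21-bit signed page displacement, immlo in bits 30:29,
 * immhi in bits 23:5. */
static uint32_t encode_adrp(unsigned rd, uint64_t pc, uint64_t target)
{
    uint64_t disp = (target >> 12) - (pc >> 12);   /* page delta */
    return 0x90000000u | ((uint32_t)disp & 3) << 29
                       | ((uint32_t)(disp >> 2) & 0x7ffffu) << 5 | rd;
}

/* ADD Xd, Xn, #imm12 (64-bit, unshifted immediate). */
static uint32_t encode_add_imm(unsigned rd, unsigned rn, uint32_t imm12)
{
    return 0x91000000u | (imm12 & 0xfffu) << 10 | rn << 5 | rd;
}
```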
* tcg/aarch64: Introduce and use long branch to register  [Pranith Kumar, 2017-07-10, 1 file, -2/+13]
  We can use a branch-to-register instruction for exit_tb for offsets
  greater than 128MB.
  CC: Alex Bennée <alex.bennee@linaro.org>
  Reviewed-by: Richard Henderson <rth@twiddle.net>
  Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
  Message-Id: <20170630143614.31059-1-bobby.prani@gmail.com>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
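  The 128MB limit comes from the direct B instruction's signed 26-bit
  instruction-count immediate. A minimal sketch of the range check,
  assuming byte addresses of 4-byte-aligned instructions:

```c
#include <stdbool.h>
#include <stdint.h>

static bool direct_branch_in_range(intptr_t from, intptr_t to)
{
    intptr_t disp = (to - from) >> 2;               /* units of 4-byte insns */
    return disp >= -(1 << 25) && disp < (1 << 25);  /* signed 26-bit field */
}
```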
* tcg/aarch64: Use ADR in tcg_out_movi  [Richard Henderson, 2017-06-19, 1 file, -1/+6]
  The new placement of the TB means that we can use one insn to load the
  return value for exit_tb returning the TB pointer.
  Tested-by: Emilio G. Cota <cota@braap.org>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/aarch64: Implement goto_ptr  [Richard Henderson, 2017-06-05, 2 files, -3/+21]
  Measurements: SPECint06 (test set), x86_64-linux-user.
  Host: APM 64-bit ARMv8 (Atlas/A57) @ 2.4 GHz

  [ASCII bar chart: per-benchmark speedup of +goto-ptr over baseline for
  astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf, omnetpp,
  perlbench, sjeng and xalancbmk, plus hmean, ranging from about 1.0x to
  1.45x]

  png: http://imgur.com/en9HE8L
  Tested-by: Emilio G. Cota <cota@braap.org>
  Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptr  [Emilio G. Cota, 2017-06-05, 1 file, -0/+1]
  Instead of exporting goto_ptr directly to TCG frontends, export
  tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer
  returned by the lookup_tb_ptr() helper. This is the only use case we
  have for goto_ptr and lookup_tb_ptr, so having this function is very
  convenient. Furthermore, it trivially allows us to avoid calling the
  lookup helper if goto_ptr is not implemented by the backend.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Emilio G. Cota <cota@braap.org>
  Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org>
  Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org>
  Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org>
  Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org>
  [rth: Squashed 4 related commits.]
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* aarch64: Change ext type to TCGType to fix warnings  [Pranith Kumar, 2017-02-28, 1 file, -2/+2]
  To fix the following warnings:

    In file included from /users/pranith/qemu/tcg/tcg.c:255:
    /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:879:24: warning:
        implicit conversion from enumeration type 'TCGMemOp' (aka 'enum
        TCGMemOp') to different enumeration type 'TCGType' (aka 'enum
        TCGType') [-Wenum-conversion]
        tcg_out_cmp(s, ext, a, b, b_const);

    /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:893:36: warning:
        implicit conversion from enumeration type 'TCGMemOp' to different
        enumeration type 'TCGType' [-Wenum-conversion]
        tcg_out_insn(s, 3201, CBZ, ext, a, offset);

    /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:895:37: warning:
        implicit conversion from enumeration type 'TCGMemOp' to different
        enumeration type 'TCGType' [-Wenum-conversion]
        tcg_out_insn(s, 3201, CBNZ, ext, a, offset);

    /users/pranith/qemu/tcg/aarch64/tcg-target.inc.c:1610:27: warning:
        implicit conversion from enumeration type 'TCGType' to different
        enumeration type 'TCGMemOp' [-Wenum-conversion]
        tcg_out_brcond(s, ext, a2, a0, a1, const_args[1], arg_label(args[3]));

  Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
  Message-Id: <20170217154311.13920-1-bobby.prani@gmail.com>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/aarch64: Fix tcg_out_movi  [Richard Henderson, 2017-01-13, 1 file, -33/+24]
  There were some patterns, like 0x0000_ffff_ffff_00ff, for which we would
  select to begin a multi-insn sequence with MOVN, but would fail to set
  the 0x0000 lane back from 0xffff.
  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <20161207180727.6286-3-rth@twiddle.net>
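  A hedged sketch of the lane analysis such a movi expansion rests on;
  count_lanes is an illustrative helper, not QEMU code:

```c
#include <stdint.h>

/* Count 16-bit lanes of 'v' equal to 'match' (0x0000 or 0xffff), used to
 * decide whether to start from MOVZ (all-zero base) or MOVN (all-ones
 * base). A MOVN-first sequence must still emit a MOVK for every lane
 * that is not 0xffff -- including 0x0000 lanes, which is what the fixed
 * bug forgot. */
static int count_lanes(uint64_t v, uint16_t match)
{
    int n = 0;
    for (int i = 0; i < 64; i += 16) {
        n += ((v >> i) & 0xffff) == match;
    }
    return n;
}

/* For 0x0000ffffffff00ff: two 0xffff lanes favor MOVN, but the 0x0000
 * lane then needs an explicit MOVK back to zero. */
```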
* tcg/aarch64: Fix addsub2 for 0+C  [Richard Henderson, 2017-01-13, 1 file, -0/+9]
  When al == xzr, we cannot use addi/subi, because that encodes xsp.
  Force a zero into the temp register for that (rare) case.
  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <20161207180727.6286-2-rth@twiddle.net>
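  The trap here is an A64 encoding rule: register number 31 means XZR in
  most operand slots, but in the Rn slot of ADD/SUB (immediate) it means
  XSP. A minimal sketch of the check, with a hypothetical helper name:

```c
#include <stdbool.h>

/* In ADD/SUB (immediate), the Rn slot interprets register number 31 as
 * XSP, not XZR. So a backend wanting "xzr + constant" must first
 * materialize zero in a scratch register and add against that. */
static bool addsub_imm_rn_means_sp(unsigned rn)
{
    return rn == 31;
}
```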
* tcg: Add opcode for ctpop  [Richard Henderson, 2017-01-10, 1 file, -0/+2]
  The number of actual invocations of ctpop itself does not warrant an
  opcode, but it is very helpful for POWER7 to use in generating an
  expansion for ctz.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/aarch64: Handle ctz and clz opcodes  [Richard Henderson, 2017-01-10, 2 files, -4/+52]
  Signed-off-by: Richard Henderson <rth@twiddle.net>

* tcg: Add clz and ctz opcodes  [Richard Henderson, 2017-01-10, 1 file, -0/+4]
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Pass the opcode width to target_parse_constraint  [Richard Henderson, 2017-01-10, 1 file, -10/+5]
  This will let us choose how to interpret a given constraint depending on
  whether the opcode is 32- or 64-bit, which will in turn let us share
  more constraint combinations between opcodes.

  At the same time, change the interface to return the advanced pointer
  instead of passing it in/out by reference.

  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Transition flat op_defs array to a target callback  [Richard Henderson, 2017-01-10, 1 file, -2/+12]
  This will allow the target to tailor the constraints to the
  auto-detected ISA extensions.
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <rth@twiddle.net>

* tcg/aarch64: Implement field extraction opcodes  [Richard Henderson, 2017-01-10, 2 files, -4/+18]
  Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
  Signed-off-by: Richard Henderson <rth@twiddle.net>