summaryrefslogtreecommitdiffstats
path: root/tcg/i386
Commit message (Collapse)AuthorAgeFilesLines
* tcg/i386: Fix build for systems without working cpuid.h (MacOSX, Win32)Peter Maydell2014-02-211-1/+3
| | | | | | | | | | | | Win32 doesn't have a cpuid.h, and MacOSX may have one but without the __cpuid() function we use, which means that commit 9d2eec20 broke the build for those platforms. Fix this by tightening up our configure cpuid.h check to test that the functions we need are present, and adding some missing #ifdef guards in tcg/i386/tcg-target.c. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Use SHLX/SHRX/SARX instructionsRichard Henderson2014-02-171-11/+50
| | | | | | | | | | | | | | These three-operand shift instructions do not require the shift count to be placed into ECX. This reduces the number of mov insns required, with the mere addition of a new register constraint. Don't attempt to get rid of the matching constraint, as that's impossible to manipulate with just a new constraint. In addition, constant shifts still need the matching constraint. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Use ANDN instructionRichard Henderson2014-02-172-13/+45
| | | | | | | | Note that the optimizer cannot simplify ANDC X,Y,C to AND X,Y,~C so we must handle constants in the implementation of andc. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Add tcg_out_vex_modrmRichard Henderson2014-02-171-3/+38
| | | | | | | | Prepare for emitting BMI insns which require VEX encoding. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Move TCG_CT_CONST_* to tcg-target.cRichard Henderson2014-02-172-3/+4
| | | | | | | | | These are not needed by users of tcg-target.h. No need to recompile when we adjust them. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: cleanup useless #ifdefAurelien Jarno2014-01-261-2/+0Star
| | | | | | | | | TCG_TARGET_HAS_movcond_i32 is always defined to 1 in tcg-target.h, so remove the corresponding #ifdef #endif sequence, left from a previous refactoring. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: use movbe instruction in qemu_ldst routinesAurelien Jarno2014-01-261-37/+80
| | | | | | | | | | | | | | | The movbe instruction has been added on some Intel Atom CPUs and on recent Intel Haswell CPUs. It allows to load/store a value and at the same time bswap it. This patch detects the avaibility of this instruction and when available use it in the qemu load/store routines in replacement of load/store + bswap. Note that for 16-bit unsigned loads, movbe + movzw is basically the same as movzw + bswap, so the patch doesn't touch this case. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> [RTH: Reduced the number of conditionals using "movop".] Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: add support for three-byte opcodesAurelien Jarno2014-01-251-8/+16
| | | | | | | | | | Add support for three-byte opcodes, starting with the 0x0f 0x38 prefix. Use P_EXT38 as the new constant, and shift all other constants so that P_EXT and P_EXT38 have neighbouring values. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> [RTH: Changed the name from P_EXT2 to P_EXT38.] Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: remove hardcoded P_REXW valueAurelien Jarno2014-01-251-1/+1
| | | | | | | | | | | | | | P_REXW is defined has a constant at the beginning of i386/tcg-target.c, but the corresponding bit is later used in a harcoded way, which defeat the purpose of a constant. Fix that by using a conditional expression operator instead of a shift. On x86 this actually makes the code slightly smaller as GCC does in practice (opc >> 8) & 8 instead of (opc & 0x800) >> 8 so the constants are smaller to load. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: fix a commentAurelien Jarno2013-12-211-1/+1
| | | | | | | The comments apply to 8-bit stores, not 8-byte stores. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* tcg-i386: Support new ldst opcodesRichard Henderson2013-10-132-90/+51Star
| | | | | | | No support for helpers with non-default endianness yet, but good enough to test the opcodes. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Remove "cb" output restriction from qemu_st8 for i386Richard Henderson2013-10-131-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | Once we form a combined qemu_st_i32 opcode, we won't be able to have separate constraints based on size. This one is fairly easy to work around, since eax is available as a scratch register. When storing variable data, this tends to merely exchange one mov for another. E.g. -: mov %esi,%ecx ... -: mov %cl,(%edx) +: mov %esi,%eax +: mov %al,(%edx) Where we do have a regression is when storing constant data, in which we may load the constant into edi, when only ecx/ebx ought to be used. The proper way to recover this regression is to allow constants as arguments to qemu_st_i32, so that we never load the constant data into a register at all, must less the wrong register. TBD. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Tidy softmmu routinesRichard Henderson2013-10-131-249/+208Star
| | | | | | | | | | | Pass two TCGReg to tcg_out_tlb_load, rather than idx+args. Move ldst_optimization routines just below tcg_out_tlb_load to avoid the need for forward declarations. Use TCGReg enum in preference to int where apprpriate. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Use TCGMemOp within qemu_ldst routinesRichard Henderson2013-10-131-64/+59Star
| | | | | | Step one in the transition, with constants passed down from tcg_out_op. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add qemu_ld_st_i32/64Richard Henderson2013-10-101-0/+2
| | | | | | | Step two in the transition, adding the new ldst opcodes. Keep the old opcodes around until all backends support the new opcodes. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add tcg-be-ldst.hRichard Henderson2013-10-101-27/+3Star
| | | | | | Move TCGLabelQemuLdst and related stuff out of tcg.h. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Make use of zero-extended memory helper routinesRichard Henderson2013-09-021-9/+6Star
| | | | | | | | For 8 and 16-bit unsigned loads, rely on the zero-extension from the helper and use a smaller 32-bit move insn. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Introduce zero and sign-extended versions of load helpersRichard Henderson2013-09-021-3/+3
| | | | | Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* exec: Split softmmu_defs.hRichard Henderson2013-09-021-3/+0Star
| | | | | | | | | | | The _cmmu helpers can be moved to exec-all.h. The helpers that are used from TCG will shortly need access to tcg_target_long so move their declarations into tcg.h. This requires minor include adjustments to all TCG backends. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Don't perform GETPC adjustment in TCG codeRichard Henderson2013-09-021-19/+14Star
| | | | | | | | | Since we now perform it inside the helper, no need to do it here. This also lets us perform a tail-call from the store slow path to the helper. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Adjust tcg_out_tlb_load for x32Richard Henderson2013-09-021-14/+27
| | | | | Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Use intptr_t appropriatelyRichard Henderson2013-09-021-22/+19Star
| | | | | Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Change tcg_out_ld/st offset to intptr_tRichard Henderson2013-09-021-2/+2
| | | | | Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Change relocation offsets to intptr_tRichard Henderson2013-09-021-1/+1
| | | | | Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Allow TCG_TARGET_REG_BITS to be specified independantlyRichard Henderson2013-09-021-4/+6
| | | | | | | | There are several hosts for which it would be useful to use the available 64-bit registers in a 32-bit pointer environment. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Change flush_icache_range arguments to uintptr_tRichard Henderson2013-09-021-2/+1Star
| | | | | Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add muluh and mulsh opcodesRichard Henderson2013-09-021-0/+4
| | | | | | | | Use them in places where mulu2 and muls2 are used. Optimize mulx2 with dead low part to mulxh. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Use new return-argument ld/st helpersRichard Henderson2013-08-261-56/+47Star
| | | | | | | | | Discontinue the jump-around-jump-to-jump scheme, trading it for a single immediate move instruction. The two extra jumps always consume 7 bytes, whereas the immediate move is either 5 or 7 bytes depending on where the code_gen_buffer gets located. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Tidy qemu_ld/st slow pathRichard Henderson2013-08-261-91/+74Star
| | | | | | | Use existing stack space for arguments; don't push/pop. Use less ifdefs and more C ifs. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Try pc-relative lea for constant formationRichard Henderson2013-08-261-5/+20
| | | | | | Use a 7 byte lea before the ultimate 10 byte movq. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Add and use tcg_out64Richard Henderson2013-08-261-2/+1Star
| | | | | | | No point in splitting the write into 32-bit pieces. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Use QEMU_BUILD_BUG_ON instead of assert for frame sizeRichard Henderson2013-07-091-3/+3
| | | | | | | We can check the condition at compile time, rather than run time. Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Move the CIE and FDE header definitions to common codeRichard Henderson2013-07-091-26/+13Star
| | | | | | | | These will necessarily be the same layout for all hosts. This limits the amount of boilerplate required to implement jit debug for a host. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Remove redundant tcg_target_init checksRichard Henderson2013-06-051-6/+0Star
| | | | | | | | We've got a compile-time check for the condition in exec/cpu-defs.h. Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: liguang <lig.fnst@cn.fujitsu.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Implement multiword arithmetic opsRichard Henderson2013-02-232-17/+26
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg: Add signed multiword multiplication operationsRichard Henderson2013-02-231-0/+2
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg: Add 64-bit multiword arithmetic operationsRichard Henderson2013-02-231-0/+3
| | | | | | Matching the 32-bit multiword arithmetic that we already have. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg-i386: Always implement 32-bit multiword opsRichard Henderson2013-02-232-12/+13
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg: Make 32-bit multiword operations optional for 64-bit hostsRichard Henderson2013-02-231-0/+4
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg-i386: use LEA for 3-operand 64-bit additionPaolo Bonzini2013-01-121-1/+1
| | | | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg-i386: Perform cmov detection at runtime for 32-bit.Richard Henderson2012-12-292-6/+30
| | | | | | | | Existing compile-time detection is spotty at best. Convert it all to runtime detection instead. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* exec: move include files to include/exec/Paolo Bonzini2012-12-191-1/+1
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* janitor: add guards to headersPaolo Bonzini2012-12-191-0/+3
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* tcg: Optimize qemu_ld/st by generating slow paths at the end of a blockYeongkyoon Lee2012-11-031-126/+278
| | | | | | | | | Add optimized TCG qemu_ld/st generation which locates the code of TLB miss cases at the end of a block after generating the other IRs. Currently, this optimization supports only i386 and x86_64 hosts. Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg-i386: Use %gs prefixes for x86_64 GUEST_BASERichard Henderson2012-10-281-56/+97
| | | | | | | | | | When we allocate a reserved_va for the guest, the kernel will likely choose an address well above 4G. At which point we must use a pair of movabsq+addq to form the host address. If we have OS support, set up a segment register to point to guest_base instead. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* tcg/i386: remove ld/st third argument register constraintAurelien Jarno2012-10-281-6/+2Star
| | | | | | | | | | | | | | | | | On x86_64, remove the constraint on the third argument register which is not needed: - For loads the helper arguments are env, addr, mem_idx. The addr value should not be in the two first argument registers as they are used in tcg_out_tlb_load(). - For stores the helper arguments are env, addr, data, mem_idx. The addr and data values should not be in the two first argument registers as they are used in tcg_out_tlb_load(). The data value should also not be in the two first argument registers, but could be in the third argument register in which case it would be already loaded at the right location. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* tcg/i386: remove suboptimal register shiftingAurelien Jarno2012-10-281-42/+31Star
| | | | | | | | | | | | | | | Now that CONFIG_TCG_PASS_AREG0 has been removed, it's easier to get an optimal code for the load/store functions. First swap the two registers used in tcg_out_tlb_load() so that the address end-up in the second register instead of the first one. Adjust tcg_out_qemu_ld() and tcg_out_qemu_st() to respectively call tcg_out_qemu_ld_direct() and tcg_out_qemu_st_direct() with the correct registers. Then replace the register shifting by direct load of the arguments. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* tcg: Remove TCG_TARGET_HAS_GUEST_BASE definePeter Maydell2012-10-121-2/+0Star
| | | | | | | | | | GUEST_BASE support is now supported by all TCG backends, and is now mandatory. Drop the now-pointless TCG_TARGET_HAS_GUEST_BASE define (set by every backend) and the error if it is unset. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
* tcg: Add TCG_COND_NEVER, TCG_COND_ALWAYSRichard Henderson2012-10-061-1/+1
| | | | | | | | | | There are several cases that can be handled easier inside both translators and code generators if we have out-of-band values for conditions. It's easy enough to handle ALWAYS and NEVER in the natural way inside the tcg middle-end. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* tcg: remove obsolete jmp opAurelien Jarno2012-10-061-9/+0Star
| | | | | | | | | | | | | The TCG jmp operation doesn't really make sense in the QEMU context, it is unused, it is not implemented by some targets, and it is wrongly implemented by some others. This patch simply removes it. Reviewed-by: Richard Henderson <rth@twiddle.net> Acked-by: Blue Swirl <blauwirbel@gmail.com> Acked-by: Stefan Weil<sw@weilnetz.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>