bwlp/qemu.git - Experimental fork of QEMU with video encoding patches

	Commit message (Collapse)	Author	Age	Files	Lines
*	target/arm: Speed up aarch64 TBL/TBX	Richard Henderson	2021-03-05	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Always perform one call instead of two for 16-byte operands. Use byte loads/stores directly into the vector register file instead of extractions and deposits to a 64-bit local variable. In order to easily receive pointers into the vector register file, convert the helper to the gvec out-of-line signature. Move the helper into vec_helper.c, where it can make use of H1 and clear_tail. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20210224230532.276878-1-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	arm tcg cpus: Fix Lesser GPL version number	Chetan Pant	2020-11-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	There is no "version 2" of the "Lesser" General Public License. It is either "GPL version 2.0" or "Lesser GPL version 2.1". This patch replaces all occurrences of "Lesser GPL version 2" with "Lesser GPL version 2.1" in comment section. Signed-off-by: Chetan Pant <chetan4windows@gmail.com> Message-Id: <20201023122913.19561-1-chetan4windows@gmail.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
*	target/arm: Fix VUDOT/VSDOT (scalar) on big-endian hosts	Peter Maydell	2020-11-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The helper functions for performing the udot/sdot operations against a scalar were not using an address-swizzling macro when converting the index of the scalar element into a pointer into the vm array. This had no effect on little-endian hosts but meant we generated incorrect results on big-endian hosts. For these insns, the index is indexing over group of 4 8-bit values, so 32 bits per indexed entity, and H4() is therefore what we want. (For Neon the only possible input indexes are 0 and 1.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20201028191712.4910-3-peter.maydell@linaro.org
*	target/arm: Fix float16 pairwise Neon ops on big-endian hosts	Peter Maydell	2020-11-02	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	In the neon_padd/pmax/pmin helpers for float16, a cut-and-paste error meant we were using the H4() address swizzler macro rather than the H2() which is required for 2-byte data. This had no effect on little-endian hosts but meant we put the result data into the destination Dreg in the wrong order on big-endian hosts. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20201028191712.4910-2-peter.maydell@linaro.org
*	target/arm/vec_helper: Add gvec fp indexed multiply-and-add operations	Peter Maydell	2020-09-01	1	-5/+22
\| \| \| \| \| \| \| \|	Add gvec helpers for doing Neon-style indexed non-fused fp multiply-and-accumulate operations. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200828183354.27913-44-peter.maydell@linaro.org
*	target/arm/vec_helper: Handle oprsz less than 16 bytes in indexed operations	Peter Maydell	2020-09-01	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	In the gvec helper functions for indexed operations, for AArch32 Neon the oprsz (total size of the vector) can be less than 16 bytes if the operation is on a D reg. Since the inner loop in these helpers always goes from 0 to segment, we must clamp it based on oprsz to avoid processing a full 16 byte segment when asked to handle an 8 byte wide vector. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-43-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VRINTX	Peter Maydell	2020-09-01	1	-0/+3
\| \| \| \| \| \| \| \| \|	Convert the Neon VRINTX insn to use gvec, and use this to implement fp16 support for it. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-42-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VRINT-with-specified-rounding-mode	Peter Maydell	2020-09-01	1	-0/+21
\| \| \| \| \| \| \| \| \|	Convert the Neon VRINT-with-specified-rounding-mode insns to gvec, and use this to implement the fp16 versions. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-41-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VCVT with rounding modes	Peter Maydell	2020-09-01	1	-0/+23
\| \| \| \| \| \| \| \| \|	Convert the Neon VCVT with-specified-rounding-mode instructions to gvec, and use this to implement fp16 support for them. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-40-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VCVT fixed-point	Peter Maydell	2020-09-01	1	-0/+4
\| \| \| \| \| \| \| \| \|	Implement fp16 for the Neon VCVT insns which convert between float and fixed-point. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-39-peter.maydell@linaro.org
*	target/arm: Convert Neon VCVT fixed-point to gvec	Peter Maydell	2020-09-01	1	-0/+20
\| \| \| \| \| \| \| \| \|	Convert the Neon VCVT float<->fixed-point insns to a gvec style, in preparation for adding fp16 support. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-38-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon float-integer VCVT	Peter Maydell	2020-09-01	1	-0/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert the Neon float-integer VCVT insns to gvec, and use this to implement fp16 support for them. Note that unlike the VFP int<->fp16 VCVT insns we converted earlier and which convert to/from a 32-bit integer, these Neon insns convert to/from 16-bit integers. So we can use the existing vfp conversion helpers for the f32<->u32/i32 case but need to provide our own for f16<->u16/i16. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-37-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon pairwise fp ops	Peter Maydell	2020-09-01	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \|	Convert the Neon pairwise fp ops to use a single gvic-style helper to do the full operation instead of one helper call for each 32-bit part. This allows us to use the same framework to implement the fp16. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-36-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VRSQRTS	Peter Maydell	2020-09-01	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \|	Convert the Neon VRSQRTS insn to using a gvec helper, and use this to implement the fp16 case. As with VRECPS, we adjust the phrasing of the new implementation slightly so that the fp32 version parallels the fp16 one. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-35-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VRECPS	Peter Maydell	2020-09-01	1	-0/+31
\| \| \| \| \| \| \| \| \| \| \| \| \|	Convert the Neon VRECPS insn to using a gvec helper, and use this to implement the fp16 case. The phrasing of the new float32_recps_nf() is slightly different from the old recps_f32() so that it parallels the f16 version; for f16 we can't assume that flush-to-zero is always enabled. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-34-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon fp compare-vs-0	Peter Maydell	2020-09-01	1	-0/+25
\| \| \| \| \| \| \| \| \| \|	Convert the neon floating-point vector compare-vs-0 insns VCEQ0, VCGT0, VCLE0, VCGE0 and VCLT0 to use a gvec helper, and use this to implement the fp16 case. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-33-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VFMA, VMFS	Peter Maydell	2020-09-01	1	-1/+32
\| \| \| \| \| \| \| \| \| \| \| \|	Convert the neon floating-point vector operations VFMA and VFMS to use a gvec helper, and use this to implement the fp16 case. This is the last use of do_3same_fp() so we can now delete that function. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-32-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VMLA, VMLS operations	Peter Maydell	2020-09-01	1	-0/+42
\| \| \| \| \| \| \| \| \|	Convert the Neon floating-point VMLA and VMLS insns over to using a gvec helper, and use this to implement the fp16 case. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-31-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VMAXNM, VMINNM	Peter Maydell	2020-09-01	1	-0/+6
\| \| \| \| \| \| \| \| \|	Convert the Neon floating point VMAXNM and VMINNM insns to using a gvec helper and use this to implement the fp16 case. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-30-peter.maydell@linaro.org
*	target/arm: Implement fp16 for Neon VMAX, VMIN	Peter Maydell	2020-09-01	1	-0/+6
\| \| \| \| \| \| \| \| \|	Convert the Neon float-point VMAX and VMIN insns over to using a gvec helper, and use this to implement the fp16 case. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-29-peter.maydell@linaro.org
*	target/arm: Implement fp16 for VACGE, VACGT	Peter Maydell	2020-09-01	1	-0/+26
\| \| \| \| \| \| \| \| \| \|	Convert the neon floating-point vector absolute comparison ops VACGE and VACGT over to using a gvec hepler and use this to implement the fp16 case. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-28-peter.maydell@linaro.org
*	target/arm: Implement fp16 for VCEQ, VCGE, VCGT comparisons	Peter Maydell	2020-09-01	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert the Neon floating-point vector comparison ops VCEQ, VCGE and VCGT over to using a gvec helper and use this to implement the fp16 case. (We put the float16_ceq() etc functions above the DO_2OP() macro definition because later when we convert the compare-against-zero instructions we'll want their definitions to be visible at that point in the source file.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-27-peter.maydell@linaro.org
*	target/arm: Implement FP16 for Neon VADD, VSUB, VABD, VMUL	Peter Maydell	2020-09-01	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \|	Implement FP16 support for the Neon insns which use the DO_3S_FP_GVEC macro: VADD, VSUB, VABD, VMUL. For VABD this requires us to implement a new gvec_fabd_h helper using the machinery we have already for the other helpers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200828183354.27913-24-peter.maydell@linaro.org
*	target/arm: Convert sq{, r}dmulh to gvec for aa64 advsimd	Richard Henderson	2020-08-28	1	-0/+48
\| \| \| \| \| \| \|	Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-21-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Convert integer multiply-add (indexed) to gvec for aa64 advsimd	Richard Henderson	2020-08-28	1	-0/+25
\| \| \| \| \| \| \|	Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-20-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Convert integer multiply (indexed) to gvec for aa64 advsimd	Richard Henderson	2020-08-28	1	-4/+25
\| \| \| \| \| \| \|	Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-19-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Generalize inl_qrdmlah_* helper functions	Richard Henderson	2020-08-28	1	-51/+29
\| \| \| \| \| \| \| \| \| \| \|	Unify add/sub helpers and add a parameter for rounding. This will allow saturating non-rounding to reuse this code. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> [PMM: fixed accidental use of '=' rather than '+=' in do_sqrdmlah_s] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20200815013145.539409-15-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Convert aes and sm4 to gvec helpers	Richard Henderson	2020-06-05	1	-11/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With this conversion, we will be able to use the same helpers with sve. In particular, pass 3 vector parameters for the 3-operand operations; for advsimd the destination register is also an input. This also fixes a bug in which we failed to clear the high bits of the SVE register after an AdvSIMD operation. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200514212831.31248-2-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Convert Neon VADD, VSUB, VABD 3-reg-same insns to decodetree	Peter Maydell	2020-05-14	1	-0/+7
\| \| \| \| \| \| \| \| \| \|	Convert the Neon VADD, VSUB, VABD 3-reg-same insns to decodetree. We already have gvec helpers for addition and subtraction, but must add one for fabd. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200512163904.10918-12-peter.maydell@linaro.org
*	target/arm: Vectorize SABA/UABA	Richard Henderson	2020-05-14	1	-0/+24
\| \| \| \| \| \| \| \| \|	Include 64-bit element size in preparation for SVE2. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-17-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Vectorize SABD/UABD	Richard Henderson	2020-05-14	1	-0/+24
\| \| \| \| \| \| \| \| \|	Include 64-bit element size in preparation for SVE2. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-16-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Clear tail in gvec_fmul_idx_, gvec_fmla_idx_	Richard Henderson	2020-05-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Must clear the tail for AdvSIMD when SVE is enabled. Fixes: ca40a6e6e39 Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-15-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Pass pointer to qc to qrdmla/qrdmls	Richard Henderson	2020-05-14	1	-30/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pass a pointer directly to env->vfp.qc[0], rather than env. This will allow SVE2, which does not modify QC, to pass a pointer to dummy storage. Change the return type of inl_qrdml.h_s16 to match the sense of the operation: signed. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-14-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Create gen_gvec_{sri,sli}	Richard Henderson	2020-05-14	1	-0/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-4-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Create gen_gvec_{u,s}{rshr,rsra}	Richard Henderson	2020-05-14	1	-0/+50
\| \| \| \| \| \| \| \| \| \| \|	Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Create gen_gvec_[us]sra	Richard Henderson	2020-05-14	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200513163245.17915-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Vectorize integer comparison vs zero	Richard Henderson	2020-04-30	1	-0/+25
\| \| \| \| \| \| \| \| \| \|	These instructions are often used in glibc's string routines. They were the final uses of the 32-bit at a time neon helpers. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200418162808.4680-1-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Convert PMULL.8 to gvec	Richard Henderson	2020-02-21	1	-0/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We still need two different helpers, since NEON and SVE2 get the inputs from different locations within the source vector. However, we can convert both to the same internal form for computation. The sve2 helper is not used yet, but adding it with this patch helps illustrate why the neon changes are helpful. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200216214232.4230-5-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Convert PMULL.64 to gvec	Richard Henderson	2020-02-21	1	-0/+33
\| \| \| \| \| \| \| \| \| \|	The gvec form will be needed for implementing SVE2. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200216214232.4230-4-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Convert PMUL.8 to gvec	Richard Henderson	2020-02-21	1	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The gvec form will be needed for implementing SVE2. Extend the implementation to operate on uint64_t instead of uint32_t. Use a counted inner loop instead of terminating when op1 goes to zero, looking toward the required implementation for ARMv8.4-DIT. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200216214232.4230-3-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Vectorize USHL and SSHL	Richard Henderson	2020-02-21	1	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These instructions shift left or right depending on the sign of the input, and 7 bits are significant to the shift. This requires several masks and selects in addition to the actual shifts to form the complete answer. That said, the operation is still a small improvement even for two 64-bit elements -- 13 vector operations instead of 2 * 7 integer operations. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20200216214232.4230-2-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Add helpers for FMLAL	Richard Henderson	2019-02-28	1	-0/+148
\| \| \| \| \| \| \| \| \| \| \| \|	Note that float16_to_float32 rightly squashes SNaN to QNaN. But of course pickNaNMulAdd, for ARM, selects SNaNs first. So we have to preserve SNaN long enough for the correct NaN to be selected. Thus float16_to_float32_by_bits. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190219222952.22183-2-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Add missing clear_tail calls	Richard Henderson	2019-02-15	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Fortunately, the functions affected are so far only called from SVE, so there is no tail to be cleared. But as we convert more of AdvSIMD to gvec, this will matter. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190209033847.9014-13-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Use vector operations for saturation	Richard Henderson	2019-02-15	1	-0/+130
\| \| \| \| \| \| \| \| \| \| \|	For same-sign saturation, we have tcg vector operations. We can compute the QC bit by comparing the saturated value against the unsaturated value. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190209033847.9014-12-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Split out FPSCR.QC to a vector field	Richard Henderson	2019-02-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Change the representation of this field such that it is easy to set from vector code. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190209033847.9014-11-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Implement SVE dot product (indexed)	Richard Henderson	2018-06-29	1	-0/+124
\| \| \| \| \| \| \| \|	Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180627043328.11531-34-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Implement SVE dot product (vectors)	Richard Henderson	2018-06-29	1	-0/+67
\| \| \| \| \| \| \| \|	Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180627043328.11531-33-richard.henderson@linaro.org [PMM: moved 'ra=%reg_movprfx' here from following patch] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Implement SVE fp complex multiply add (indexed)	Richard Henderson	2018-06-29	1	-20/+30
\| \| \| \| \| \| \| \| \| \| \| \|	Enhance the existing helpers to support SVE, which takes the index from each 128-bit segment. The change has no effect for AdvSIMD, since there is only one such segment. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180627043328.11531-32-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Pass index to AdvSIMD FCMLA (indexed)	Richard Henderson	2018-06-29	1	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For aa64 advsimd, we had been passing the pre-indexed vector. However, sve applies the index to each 128-bit segment, so we need to pass in the index separately. For aa32 advsimd, the fp32 operation always has index 0, but we failed to interpret the fp16 index correctly. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180627043328.11531-31-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
*	target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group	Richard Henderson	2018-06-29	1	-0/+20
\| \| \| \| \| \| \|	Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180627043328.11531-21-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>