bwlp/qemu.git/target/ppc/fpu_helper.c, branch spice_video_codecs

target/ppc: Moved XSTSTDC[QDS]P to decodetree

2022-10-28T16:15:22+00:00

Moved XSTSTDCSP, XSTSTDCDP and XSTSTDCQP to decodetree and moved some of its decoding away from the helper as previously the DCMX, XB and BF were calculated in the helper with the help of cpu_env, now that part was moved to the decodetree with the rest. xvtstdcsp: rept loop master patch 8 12500 1,85393600 1,94683600 (+5.0%) 25 4000 1,78779800 1,92479000 (+7.7%) 100 1000 2,12775000 2,28895500 (+7.6%) 500 200 2,99655300 3,23102900 (+7.8%) 2500 40 6,89082200 7,44827500 (+8.1%) 8000 12 17,50585500 18,95152100 (+8.3%) xvtstdcdp: rept loop master patch 8 12500 1,39043100 1,33539800 (-4.0%) 25 4000 1,35731800 1,37347800 (+1.2%) 100 1000 1,51514800 1,56053000 (+3.0%) 500 200 2,21014400 2,47906000 (+12.2%) 2500 40 5,39488200 6,68766700 (+24.0%) 8000 12 13,98623900 18,17661900 (+30.0%) xvtstdcdp: rept loop master patch 8 12500 1,35123800 1,34455800 (-0.5%) 25 4000 1,36441200 1,36759600 (+0.2%) 100 1000 1,49763500 1,54138400 (+2.9%) 500 200 2,19020200 2,46196400 (+12.4%) 2500 40 5,39265700 6,68147900 (+23.9%) 8000 12 14,04163600 18,19669600 (+29.6%) As some values are now decoded outside the helper and passed to it as an argument the number of arguments of the helper increased, the number of TCGop needed to load the arguments increased. I suspect that's why the slow-down in the tests with a high REPT but low LOOP. Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson Message-Id: <20221019125040.48028-12-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: Moved XVTSTDC[DS]P to decodetree

2022-10-28T16:15:22+00:00

Moved XVTSTDCSP and XVTSTDCDP to decodetree an restructured the helper to be simpler and do all decoding in the decodetree (so XB, XT and DCMX are all calculated outside the helper). Obs: The tests in this one are slightly different, these are the sum of these instructions with all possible immediate and those instructions are repeated 10 times. xvtstdcsp: rept loop master patch 8 12500 2,76402100 2,70699100 (-2.1%) 25 4000 2,64867100 2,67884100 (+1.1%) 100 1000 2,73806300 2,78701000 (+1.8%) 500 200 3,44666500 3,61027600 (+4.7%) 2500 40 5,85790200 6,47475500 (+10.5%) 8000 12 15,22102100 17,46062900 (+14.7%) xvtstdcdp: rept loop master patch 8 12500 2,11818000 1,61065300 (-24.0%) 25 4000 2,04573400 1,60132200 (-21.7%) 100 1000 2,13834100 1,69988100 (-20.5%) 500 200 2,73977000 2,48631700 (-9.3%) 2500 40 5,05067000 5,25914100 (+4.1%) 8000 12 14,60507800 15,93704900 (+9.1%) Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson Message-Id: <20221019125040.48028-11-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: Clear fpstatus flags on helpers missing it

2022-09-20T13:54:06+00:00

In ppc emulation, exception flags are not cleared at the end of an instruction. Instead, the next instruction is responsible to clear it before its emulation. However, some helpers are not doing it, causing an issue where the previously set exception flags are being used and leading to incorrect values being set in FPSCR. Fix this by clearing fp_status before doing the instruction 'real' work for the following helpers that were missing this behavior: - VSX_CVT_INT_TO_FP_VECTOR - VSX_CVT_FP_TO_FP - VSX_CVT_FP_TO_INT_VECTOR - VSX_CVT_FP_TO_INT2 - VSX_CVT_FP_TO_INT - VSX_CVT_FP_TO_FP_HP - VSX_CVT_FP_TO_FP_VECTOR - VSX_CMP - VSX_ROUND - xscvqpdp - xscvdpsp[n] Signed-off-by: Víctor Colombo Reviewed-by: Daniel Henrique Barboza Message-Id: <20220906125523.38765-9-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: Zero second doubleword for VSX madd instructions

2022-09-20T13:54:06+00:00

In 205eb5a89e we updated most VSX instructions to zero the second doubleword, as is requested by PowerISA since v3.1. However, VSX_MADD helper was left behind unchanged, while it is also affected and should be fixed as well. This patch applies the fix for MADD instructions. Fixes: 205eb5a89e ("target/ppc: Change VSX instructions behavior to fill with zeros") Signed-off-by: Víctor Colombo Reviewed-by: Daniel Henrique Barboza Message-Id: <20220906125523.38765-6-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: Merge fsqrt and fsqrts helpers

2022-09-20T13:54:06+00:00

These two helpers are almost identical, differing only by the softfloat operation it calls. Merge them into one using a macro. Also, take this opportunity to capitalize the helper name as we moved the instruction to decodetree in a previous patch. Signed-off-by: Víctor Colombo Reviewed-by: Richard Henderson Message-Id: <20220905123746.54659-4-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: Bugfix FP when OE/UE are set

2022-08-31T17:08:05+00:00

When an overflow exception occurs and OE is set the intermediate result should be adjusted (by subtracting from the exponent) to avoid rounding to inf. The same applies to an underflow exceptionion and UE (but adding to the exponent). To do this set the fp_status.rebias_overflow when OE is set and fp_status.rebias_underflow when UE is set as the FPU will recalculate in case of a overflow/underflow if the according rebias* is set. Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson Message-Id: <20220805141522.412864-3-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: fix unreachable code in fpu_helper.c

2022-06-20T11:38:58+00:00

Commit c29018cc7395 added an env->fpscr OR operation using a ternary that checks if 'error' is not zero: env->fpscr |= error ? FP_FEX : 0; However, in the current body of do_fpscr_check_status(), 'error' is granted to be always non-zero at that point. The result is that Coverity is less than pleased: Control flow issues (DEADCODE) Execution cannot reach the expression "0ULL" inside this statement: "env->fpscr |= (error ? 1073...". Remove the ternary and always make env->fpscr |= FP_FEX. Cc: Lucas Mateus Castro (alqotel) Cc: Richard Henderson Fixes: Coverity CID 1489442 Fixes: c29018cc7395 ("target/ppc: Implemented xvf*ger*") Signed-off-by: Daniel Henrique Barboza Reviewed-by: Lucas Mateus Castro (alqotel) Message-Id: <20220602191048.137511-1-danielhb413@gmail.com> Signed-off-by: Daniel Henrique Barboza

target/ppc: Implemented [pm]xvbf16ger2*

2022-05-26T20:11:33+00:00

Implement the following PowerISA v3.1 instructions: xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update) xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate xvbf16ger2np: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate xvbf16ger2pn: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate xvbf16ger2pp: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate pmxvbf16ger2: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) pmxvbf16ger2nn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate pmxvbf16ger2np: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate pmxvbf16ger2pn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate pmxvbf16ger2pp: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson Message-Id: <20220524140537.27451-8-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: Implemented xvf16ger*

2022-05-26T20:11:33+00:00

Implement the following PowerISA v3.1 instructions: xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update) xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate xvf16ger2np: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate xvf16ger2pn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate xvf16ger2pp: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson Message-Id: <20220524140537.27451-6-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza

target/ppc: Implemented xvfger

2022-05-26T20:11:33+00:00

Implement the following PowerISA v3.1 instructions: xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update) xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf32gernp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf32gerpn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf32gerpp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate xvf64ger: VSX Vector 64-bit Floating-Point GER (rank-1 update) xvf64gernn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf64gernp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf64gerpn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf64gerpp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) Reviewed-by: Richard Henderson Message-Id: <20220524140537.27451-5-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza