author:    Linus Torvalds  2017-11-17 23:34:42 +0100
committer: Linus Torvalds  2017-11-17 23:34:42 +0100
commit:    f6705bf959efac87bca76d40050d342f1d212587 (patch)
tree:      e199b124c6067a92be7f4727538ffc721670fc28 /drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
parent:    Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/l... (diff)
parent:    Merge branch 'drm-next-4.15-dc' of git://people.freedesktop.org/~agd5f/linux ... (diff)
Merge tag 'drm-for-v4.15-amd-dc' of git://people.freedesktop.org/~airlied/linux
Pull amdgpu DC display code for Vega from Dave Airlie:
"This is the pull request for the AMD DC (display code) layer which is
a requirement to program the display engines on the new Vega and Raven
based GPUs. It also contains support for all amdgpu supported GPUs
(CIK, VI, Polaris), which has to be enabled. It is also a kms atomic
modesetting compatible driver (unlike the current in-tree display
code).
I've kept it separate from drm-next because it may have some things
that cause you to reject it.
Background story:
AMD have an internal team creating a shared OS codebase for display at
hw bring-up time, using information from their hardware teams. This
process doesn't lead to the most Linux-friendly-looking code, but we
have worked together on cleaning a lot of it up, dealing with
sparse/smatch/checkpatch, and having their team internally adhere to
Linux coding standards.
This tree is a complete history, rebased since they started opening
it; we decided not to squash it down, as the history may have some
value. Some of the commits therefore might not reach kernel standards,
and we are steadily training people at AMD to write better commit
messages.
There is a major bunch of generated bandwidth calculation and
verification code that comes from their hardware team. On Vega and
before, this is float based; on Raven (DCN10) it is double based. They
do the required things to do FP in the kernel, and I understand this
might raise some issues. Rewriting the bandwidth code would be a major
undertaking in reverification; it is non-trivial to work out whether a
display can handle the complete set of mode information thrown at it.
Future story:
There is a TODO list with this, and it addresses most of the remaining
things that would be nice to refine/remove. The DCN10 code is still
under development internally; they push out a lot of patches quite
regularly and are supporting this code base with their display team. I
think we've reached the point where keeping it out of tree is going to
motivate distributions to start carrying the code, so I'd prefer we
get it in tree. I think this code is slightly better than STAGING
quality, but not massively so. I'd really like to see that float/double
magic gone and fixed point used, but AMD don't seem to think the
accuracy and revalidation of the code is worth the effort"
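
For context on "the required things to do FP in the kernel" mentioned
above: kernel code runs without userspace FPU context, so any
float/double section has to be bracketed so the FPU state is saved and
preemption is disabled; on x86 that is the kernel_fpu_begin() /
kernel_fpu_end() pair, and the translation unit also needs FP-enabled
compiler flags in its Makefile. A minimal sketch of the pattern (the
function name and arithmetic are illustrative assumptions, not the
actual DC bandwidth code):

	#include <linux/types.h>
	#include <asm/fpu/api.h>	/* x86: kernel_fpu_begin()/kernel_fpu_end() */

	/* Illustrative only: guard a float computation in kernel context. */
	static u32 scaled_ratio_example(u32 num, u32 den)
	{
		u32 result;

		kernel_fpu_begin();	/* save FPU state, disable preemption */
		result = (u32)(((float)num / (float)den) * 1000.0f);
		kernel_fpu_end();	/* restore FPU state */

		return result;
	}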
* tag 'drm-for-v4.15-amd-dc' of git://people.freedesktop.org/~airlied/linux: (1110 commits)
drm/amd/display: fix MST link training fail division by 0
drm/amd/display: Fix formatting for null pointer dereference fix
drm/amd/display: Remove dangling planes on dc commit state
drm/amd/display: add flip_immediate to commit update for stream
drm/amd/display: Miss register MST encoder cbs
drm/amd/display: Fix warnings on S3 resume
drm/amd/display: use num_timing_generator instead of pipe_count
drm/amd/display: use configurable FBC option in dm
drm/amd/display: fix AZ clock not enabled before program AZ endpoint
amdgpu/dm: Don't use DRM_ERROR in amdgpu_dm_atomic_check
amd/display: Fix potential null dereference in dce_calcs.c
amdgpu/dm: Remove unused forward declaration
drm/amdgpu: Remove unused dc_stream from amdgpu_crtc
amdgpu/dc: Fix double unlock in amdgpu_dm_commit_planes
amdgpu/dc: Fix missing null checks in amdgpu_dm.c
amdgpu/dc: Fix potential null dereferences in amdgpu_dm.c
amdgpu/dc: fix more indentation warnings
amdgpu/dc: handle allocation failures in dc_commit_planes_to_stream.
amdgpu/dc: fix indentation warning from smatch.
amdgpu/dc: fix non-ansi function decls.
...
Diffstat (limited to 'drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c')
-rw-r--r--  drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c | 567 ++++++++++++++++
1 file changed, 567 insertions(+), 0 deletions(-)
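
The file below implements signed fixed-point arithmetic in Q31.32
format: each value is a single int64_t whose upper 32 bits (including
sign) hold the integer part and whose lower 32 bits hold the fraction,
so 1.0 is 1 << 32. A minimal user-space sketch of the representation
(the helper names here are hypothetical, not part of the kernel API):

	#include <stdint.h>
	#include <stdio.h>

	#define FRAC_BITS 32	/* FIXED31_32_BITS_PER_FRACTIONAL_PART below */

	/* Pack an integer: shift it up past the 32 fractional bits. */
	static int64_t q31_32_from_int(int32_t i)
	{
		return (int64_t)i << FRAC_BITS;
	}

	/* Unpack to double for display only; the kernel code never does this. */
	static double q31_32_to_double(int64_t v)
	{
		return (double)v / (double)(1ULL << FRAC_BITS);
	}

	int main(void)
	{
		int64_t half = 1LL << (FRAC_BITS - 1);	/* 0.5 */
		int64_t x = q31_32_from_int(3) + half;	/* 3.5 */

		printf("%f\n", q31_32_to_double(x));	/* prints 3.500000 */
		return 0;
	}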
diff --git a/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
new file mode 100644
index 000000000000..26936892c6f5
--- /dev/null
+++ b/drivers/gpu/drm/amd/display/dc/basics/fixpt31_32.c
@@ -0,0 +1,567 @@
+/*
+ * Copyright 2012-15 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: AMD
+ *
+ */
+
+#include "dm_services.h"
+#include "include/fixed31_32.h"
+
+static inline uint64_t abs_i64(
+	int64_t arg)
+{
+	if (arg > 0)
+		return (uint64_t)arg;
+	else
+		return (uint64_t)(-arg);
+}
+
+/*
+ * @brief
+ * result = dividend / divisor
+ * *remainder = dividend % divisor
+ */
+static inline uint64_t complete_integer_division_u64(
+	uint64_t dividend,
+	uint64_t divisor,
+	uint64_t *remainder)
+{
+	uint64_t result;
+
+	ASSERT(divisor);
+
+	result = div64_u64_rem(dividend, divisor, remainder);
+
+	return result;
+}
+
+
+#define FRACTIONAL_PART_MASK \
+	((1ULL << FIXED31_32_BITS_PER_FRACTIONAL_PART) - 1)
+
+#define GET_INTEGER_PART(x) \
+	((x) >> FIXED31_32_BITS_PER_FRACTIONAL_PART)
+
+#define GET_FRACTIONAL_PART(x) \
+	(FRACTIONAL_PART_MASK & (x))
+
+struct fixed31_32 dal_fixed31_32_from_fraction(
+	int64_t numerator,
+	int64_t denominator)
+{
+	struct fixed31_32 res;
+
+	bool arg1_negative = numerator < 0;
+	bool arg2_negative = denominator < 0;
+
+	uint64_t arg1_value = arg1_negative ? -numerator : numerator;
+	uint64_t arg2_value = arg2_negative ? -denominator : denominator;
+
+	uint64_t remainder;
+
+	/* determine integer part */
+
+	uint64_t res_value = complete_integer_division_u64(
+		arg1_value, arg2_value, &remainder);
+
+	ASSERT(res_value <= LONG_MAX);
+
+	/* determine fractional part */
+	{
+		uint32_t i = FIXED31_32_BITS_PER_FRACTIONAL_PART;
+
+		do {
+			remainder <<= 1;
+
+			res_value <<= 1;
+
+			if (remainder >= arg2_value) {
+				res_value |= 1;
+				remainder -= arg2_value;
+			}
+		} while (--i != 0);
+	}
+
+	/* round up LSB */
+	{
+		uint64_t summand = (remainder << 1) >= arg2_value;
+
+		ASSERT(res_value <= LLONG_MAX - summand);
+
+		res_value += summand;
+	}
+
+	res.value = (int64_t)res_value;
+
+	if (arg1_negative ^ arg2_negative)
+		res.value = -res.value;
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_from_int_nonconst(
+	int64_t arg)
+{
+	struct fixed31_32 res;
+
+	ASSERT((LONG_MIN <= arg) && (arg <= LONG_MAX));
+
+	res.value = arg << FIXED31_32_BITS_PER_FRACTIONAL_PART;
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_shl(
+	struct fixed31_32 arg,
+	uint8_t shift)
+{
+	struct fixed31_32 res;
+
+	ASSERT(((arg.value >= 0) && (arg.value <= LLONG_MAX >> shift)) ||
+		((arg.value < 0) && (arg.value >= LLONG_MIN >> shift)));
+
+	res.value = arg.value << shift;
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_add(
+	struct fixed31_32 arg1,
+	struct fixed31_32 arg2)
+{
+	struct fixed31_32 res;
+
+	ASSERT(((arg1.value >= 0) && (LLONG_MAX - arg1.value >= arg2.value)) ||
+		((arg1.value < 0) && (LLONG_MIN - arg1.value <= arg2.value)));
+
+	res.value = arg1.value + arg2.value;
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_sub(
+	struct fixed31_32 arg1,
+	struct fixed31_32 arg2)
+{
+	struct fixed31_32 res;
+
+	ASSERT(((arg2.value >= 0) && (LLONG_MIN + arg2.value <= arg1.value)) ||
+		((arg2.value < 0) && (LLONG_MAX + arg2.value >= arg1.value)));
+
+	res.value = arg1.value - arg2.value;
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_mul(
+	struct fixed31_32 arg1,
+	struct fixed31_32 arg2)
+{
+	struct fixed31_32 res;
+
+	bool arg1_negative = arg1.value < 0;
+	bool arg2_negative = arg2.value < 0;
+
+	uint64_t arg1_value = arg1_negative ? -arg1.value : arg1.value;
+	uint64_t arg2_value = arg2_negative ? -arg2.value : arg2.value;
+
+	uint64_t arg1_int = GET_INTEGER_PART(arg1_value);
+	uint64_t arg2_int = GET_INTEGER_PART(arg2_value);
+
+	uint64_t arg1_fra = GET_FRACTIONAL_PART(arg1_value);
+	uint64_t arg2_fra = GET_FRACTIONAL_PART(arg2_value);
+
+	uint64_t tmp;
+
+	res.value = arg1_int * arg2_int;
+
+	ASSERT(res.value <= LONG_MAX);
+
+	res.value <<= FIXED31_32_BITS_PER_FRACTIONAL_PART;
+
+	tmp = arg1_int * arg2_fra;
+
+	ASSERT(tmp <= (uint64_t)(LLONG_MAX - res.value));
+
+	res.value += tmp;
+
+	tmp = arg2_int * arg1_fra;
+
+	ASSERT(tmp <= (uint64_t)(LLONG_MAX - res.value));
+
+	res.value += tmp;
+
+	tmp = arg1_fra * arg2_fra;
+
+	tmp = (tmp >> FIXED31_32_BITS_PER_FRACTIONAL_PART) +
+		(tmp >= (uint64_t)dal_fixed31_32_half.value);
+
+	ASSERT(tmp <= (uint64_t)(LLONG_MAX - res.value));
+
+	res.value += tmp;
+
+	if (arg1_negative ^ arg2_negative)
+		res.value = -res.value;
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_sqr(
+	struct fixed31_32 arg)
+{
+	struct fixed31_32 res;
+
+	uint64_t arg_value = abs_i64(arg.value);
+
+	uint64_t arg_int = GET_INTEGER_PART(arg_value);
+
+	uint64_t arg_fra = GET_FRACTIONAL_PART(arg_value);
+
+	uint64_t tmp;
+
+	res.value = arg_int * arg_int;
+
+	ASSERT(res.value <= LONG_MAX);
+
+	res.value <<= FIXED31_32_BITS_PER_FRACTIONAL_PART;
+
+	tmp = arg_int * arg_fra;
+
+	ASSERT(tmp <= (uint64_t)(LLONG_MAX - res.value));
+
+	res.value += tmp;
+
+	ASSERT(tmp <= (uint64_t)(LLONG_MAX - res.value));
+
+	res.value += tmp;
+
+	tmp = arg_fra * arg_fra;
+
+	tmp = (tmp >> FIXED31_32_BITS_PER_FRACTIONAL_PART) +
+		(tmp >= (uint64_t)dal_fixed31_32_half.value);
+
+	ASSERT(tmp <= (uint64_t)(LLONG_MAX - res.value));
+
+	res.value += tmp;
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_recip(
+	struct fixed31_32 arg)
+{
+	/*
+	 * @note
+	 * Good idea to use Newton's method
+	 */
+
+	ASSERT(arg.value);
+
+	return dal_fixed31_32_from_fraction(
+		dal_fixed31_32_one.value,
+		arg.value);
+}
+
+struct fixed31_32 dal_fixed31_32_sinc(
+	struct fixed31_32 arg)
+{
+	struct fixed31_32 square;
+
+	struct fixed31_32 res = dal_fixed31_32_one;
+
+	int32_t n = 27;
+
+	struct fixed31_32 arg_norm = arg;
+
+	if (dal_fixed31_32_le(
+		dal_fixed31_32_two_pi,
+		dal_fixed31_32_abs(arg))) {
+		arg_norm = dal_fixed31_32_sub(
+			arg_norm,
+			dal_fixed31_32_mul_int(
+				dal_fixed31_32_two_pi,
+				(int32_t)div64_s64(
+					arg_norm.value,
+					dal_fixed31_32_two_pi.value)));
+	}
+
+	square = dal_fixed31_32_sqr(arg_norm);
+
+	do {
+		res = dal_fixed31_32_sub(
+			dal_fixed31_32_one,
+			dal_fixed31_32_div_int(
+				dal_fixed31_32_mul(
+					square,
+					res),
+				n * (n - 1)));
+
+		n -= 2;
+	} while (n > 2);
+
+	if (arg.value != arg_norm.value)
+		res = dal_fixed31_32_div(
+			dal_fixed31_32_mul(res, arg_norm),
+			arg);
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_sin(
+	struct fixed31_32 arg)
+{
+	return dal_fixed31_32_mul(
+		arg,
+		dal_fixed31_32_sinc(arg));
+}
+
+struct fixed31_32 dal_fixed31_32_cos(
+	struct fixed31_32 arg)
+{
+	/* TODO implement argument normalization */
+
+	const struct fixed31_32 square = dal_fixed31_32_sqr(arg);
+
+	struct fixed31_32 res = dal_fixed31_32_one;
+
+	int32_t n = 26;
+
+	do {
+		res = dal_fixed31_32_sub(
+			dal_fixed31_32_one,
+			dal_fixed31_32_div_int(
+				dal_fixed31_32_mul(
+					square,
+					res),
+				n * (n - 1)));
+
+		n -= 2;
+	} while (n != 0);
+
+	return res;
+}
+
+/*
+ * @brief
+ * result = exp(arg),
+ * where abs(arg) < 1
+ *
+ * Calculated as Taylor series.
+ */
+static struct fixed31_32 fixed31_32_exp_from_taylor_series(
+	struct fixed31_32 arg)
+{
+	uint32_t n = 9;
+
+	struct fixed31_32 res = dal_fixed31_32_from_fraction(
+		n + 2,
+		n + 1);
+	/* TODO find correct res */
+
+	ASSERT(dal_fixed31_32_lt(arg, dal_fixed31_32_one));
+
+	do
+		res = dal_fixed31_32_add(
+			dal_fixed31_32_one,
+			dal_fixed31_32_div_int(
+				dal_fixed31_32_mul(
+					arg,
+					res),
+				n));
+	while (--n != 1);
+
+	return dal_fixed31_32_add(
+		dal_fixed31_32_one,
+		dal_fixed31_32_mul(
+			arg,
+			res));
+}
+
+struct fixed31_32 dal_fixed31_32_exp(
+	struct fixed31_32 arg)
+{
+	/*
+	 * @brief
+	 * Main equation is:
+	 * exp(x) = exp(r + m * ln(2)) = (1 << m) * exp(r),
+	 * where m = round(x / ln(2)), r = x - m * ln(2)
+	 */
+
+	if (dal_fixed31_32_le(
+		dal_fixed31_32_ln2_div_2,
+		dal_fixed31_32_abs(arg))) {
+		int32_t m = dal_fixed31_32_round(
+			dal_fixed31_32_div(
+				arg,
+				dal_fixed31_32_ln2));
+
+		struct fixed31_32 r = dal_fixed31_32_sub(
+			arg,
+			dal_fixed31_32_mul_int(
+				dal_fixed31_32_ln2,
+				m));
+
+		ASSERT(m != 0);
+
+		ASSERT(dal_fixed31_32_lt(
+			dal_fixed31_32_abs(r),
+			dal_fixed31_32_one));
+
+		if (m > 0)
+			return dal_fixed31_32_shl(
+				fixed31_32_exp_from_taylor_series(r),
+				(uint8_t)m);
+		else
+			return dal_fixed31_32_div_int(
+				fixed31_32_exp_from_taylor_series(r),
+				1LL << -m);
+	} else if (arg.value != 0)
+		return fixed31_32_exp_from_taylor_series(arg);
+	else
+		return dal_fixed31_32_one;
+}
+
+struct fixed31_32 dal_fixed31_32_log(
+	struct fixed31_32 arg)
+{
+	struct fixed31_32 res = dal_fixed31_32_neg(dal_fixed31_32_one);
+	/* TODO improve 1st estimation */
+
+	struct fixed31_32 error;
+
+	ASSERT(arg.value > 0);
+	/* TODO if arg is negative, return NaN */
+	/* TODO if arg is zero, return -INF */
+
+	do {
+		struct fixed31_32 res1 = dal_fixed31_32_add(
+			dal_fixed31_32_sub(
+				res,
+				dal_fixed31_32_one),
+			dal_fixed31_32_div(
+				arg,
+				dal_fixed31_32_exp(res)));
+
+		error = dal_fixed31_32_sub(
+			res,
+			res1);
+
+		res = res1;
+		/* TODO determine max_allowed_error based on quality of exp() */
+	} while (abs_i64(error.value) > 100ULL);
+
+	return res;
+}
+
+struct fixed31_32 dal_fixed31_32_pow(
+	struct fixed31_32 arg1,
+	struct fixed31_32 arg2)
+{
+	return dal_fixed31_32_exp(
+		dal_fixed31_32_mul(
+			dal_fixed31_32_log(arg1),
+			arg2));
+}
+
+int32_t dal_fixed31_32_floor(
+	struct fixed31_32 arg)
+{
+	uint64_t arg_value = abs_i64(arg.value);
+
+	if (arg.value >= 0)
+		return (int32_t)GET_INTEGER_PART(arg_value);
+	else
+		return -(int32_t)GET_INTEGER_PART(arg_value);
+}
+
+int32_t dal_fixed31_32_round(
+	struct fixed31_32 arg)
+{
+	uint64_t arg_value = abs_i64(arg.value);
+
+	const int64_t summand = dal_fixed31_32_half.value;
+
+	ASSERT(LLONG_MAX - (int64_t)arg_value >= summand);
+
+	arg_value += summand;
+
+	if (arg.value >= 0)
+		return (int32_t)GET_INTEGER_PART(arg_value);
+	else
+		return -(int32_t)GET_INTEGER_PART(arg_value);
+}
+
+int32_t dal_fixed31_32_ceil(
+	struct fixed31_32 arg)
+{
+	uint64_t arg_value = abs_i64(arg.value);
+
+	const int64_t summand = dal_fixed31_32_one.value -
+		dal_fixed31_32_epsilon.value;
+
+	ASSERT(LLONG_MAX - (int64_t)arg_value >= summand);
+
+	arg_value += summand;
+
+	if (arg.value >= 0)
+		return (int32_t)GET_INTEGER_PART(arg_value);
+	else
+		return -(int32_t)GET_INTEGER_PART(arg_value);
+}
+
+/* this function is a generic helper to translate fixed point value to
+ * specified integer format that will consist of integer_bits integer part and
+ * fractional_bits fractional part. For example it is used in
+ * dal_fixed31_32_u2d19 to receive 2 bits integer part and 19 bits fractional
+ * part in 32 bits. It is used in hw programming (scaler)
+ */
+
+static inline uint32_t ux_dy(
+	int64_t value,
+	uint32_t integer_bits,
+	uint32_t fractional_bits)
+{
+	/* 1. create mask of integer part */
+	uint32_t result = (1 << integer_bits) - 1;
+	/* 2. mask out fractional part */
+	uint32_t fractional_part = FRACTIONAL_PART_MASK & value;
+	/* 3. shrink fixed point integer part to be of integer_bits width */
+	result &= GET_INTEGER_PART(value);
+	/* 4. make space for fractional part to be filled in after integer */
+	result <<= fractional_bits;
+	/* 5. shrink fixed point fractional part to be of fractional_bits width */
+	fractional_part >>= FIXED31_32_BITS_PER_FRACTIONAL_PART - fractional_bits;
+	/* 6. merge the result */
+	return result | fractional_part;
+}
+
+uint32_t dal_fixed31_32_u2d19(
+	struct fixed31_32 arg)
+{
+	return ux_dy(arg.value, 2, 19);
+}
+
+uint32_t dal_fixed31_32_u0d19(
+	struct fixed31_32 arg)
+{
+	return ux_dy(arg.value, 0, 19);
+}
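
To make the ux_dy() packing concrete: for 1.5 in Q31.32 (raw value
3 << 31), the 2.19 result keeps the integer bit 1 shifted up to bit 19
plus the top 19 fraction bits, giving 0xC0000 (which is 1.5 * 2^19). A
user-space restatement of the same arithmetic, as a sanity-check
sketch (ux_dy_check is a hypothetical name, not the kernel build):

	#include <assert.h>
	#include <stdint.h>

	/* Same six steps as ux_dy() above, restated for positive values. */
	static uint32_t ux_dy_check(int64_t value, uint32_t integer_bits,
				    uint32_t fractional_bits)
	{
		uint32_t result = ((1u << integer_bits) - 1) &
			(uint32_t)(value >> 32);		/* integer part */
		uint32_t fractional_part =
			(uint32_t)(value & 0xffffffffULL);	/* fraction part */

		result <<= fractional_bits;		/* make room for fraction */
		fractional_part >>= 32 - fractional_bits; /* keep top bits */

		return result | fractional_part;
	}

	int main(void)
	{
		int64_t one_and_half = 3LL << 31;	/* 1.5 in Q31.32 */

		/* 2.19 format: integer 1 lands at bit 19, fraction 0.5 at bit 18 */
		assert(ux_dy_check(one_and_half, 2, 19) ==
			((1u << 19) | (1u << 18)));
		return 0;
	}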