path: root/util/bufferiszero.c
Commit message | Author | Age | Files | Lines
* cutils: Rewrite x86 buffer zero checking | Richard Henderson | 2016-09-14 | 1 | -75/+156
  Handle alignment of buffers, so that the vector paths can be used more often.

  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  Message-Id: <1473800239-13841-1-git-send-email-rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
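To illustrate what "handling alignment" can mean for a zero check, here is a minimal sketch. It is not the code from this commit: the name `sketch_is_zero_aligned` is hypothetical, and plain 64-bit words stand in for the vector registers the commit refers to. The idea shown is to check the unaligned head byte by byte, so that the wide loop only ever sees aligned addresses, and then finish the tail byte by byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the QEMU implementation.
 * 64-bit words are an assumption standing in for SSE/AVX vectors. */
bool sketch_is_zero_aligned(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    const unsigned char *end = p + len;

    /* Head: advance byte by byte until p is 8-byte aligned. */
    while (p < end && ((uintptr_t)p % sizeof(uint64_t)) != 0) {
        if (*p++ != 0) {
            return false;
        }
    }

    /* Body: p is now aligned, so every load here is an aligned load.
     * memcpy avoids strict-aliasing problems; compilers typically
     * turn it into a single load. */
    while ((size_t)(end - p) >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, p, sizeof(w));
        if (w != 0) {
            return false;
        }
        p += sizeof(w);
    }

    /* Tail: fewer than 8 bytes remain. */
    while (p < end) {
        if (*p++ != 0) {
            return false;
        }
    }
    return true;
}
```

With this shape, a buffer that starts at an odd address still spends most of its length in the wide loop instead of falling back to a slow path for the whole buffer.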
* cutils: Add generic prefetch | Richard Henderson | 2016-09-13 | 1 | -0/+5
  There's no real knowledge of the cacheline size, just prefetching one loop ahead.

  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <1472496380-19706-7-git-send-email-rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
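"Prefetching one loop ahead" can be sketched as follows. This is an illustration only, not the five lines added by this commit: the function name and the 64-byte chunk size are assumptions, and `__builtin_prefetch` is the GCC/Clang builtin, so the sketch assumes one of those compilers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the QEMU implementation.
 * CHUNK is the amount consumed per loop iteration; it is an arbitrary
 * choice here and is not derived from the real cache line size. */
bool sketch_is_zero_prefetch(const void *buf, size_t len)
{
    enum { CHUNK = 64 };
    const unsigned char *p = buf;

    while (len >= CHUNK) {
        uint64_t w[CHUNK / sizeof(uint64_t)];
        uint64_t acc = 0;

        /* Hint at the data the next iteration will read. The address is
         * at most one past the end of the buffer, and a prefetch of an
         * unmapped address does not fault. */
        __builtin_prefetch(p + CHUNK);

        /* OR the whole chunk together and test once per iteration. */
        memcpy(w, p, CHUNK);
        for (size_t i = 0; i < CHUNK / sizeof(uint64_t); i++) {
            acc |= w[i];
        }
        if (acc != 0) {
            return false;
        }
        p += CHUNK;
        len -= CHUNK;
    }

    /* Remaining bytes that do not fill a whole chunk. */
    for (; len > 0; len--, p++) {
        if (*p != 0) {
            return false;
        }
    }
    return true;
}
```

The prefetch distance is simply the loop stride, which matches the commit's remark that no cache line size is consulted.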
* cutils: Add SSE4 version | Paolo Bonzini | 2016-09-13 | 1 | -0/+10
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cutils: Add test for buffer_is_zero | Richard Henderson | 2016-09-13 | 1 | -0/+20
  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <1472496380-19706-6-git-send-email-rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cutils: Remove ppc buffer zero checking | Richard Henderson | 2016-09-13 | 1 | -25/+1
  For ppc64le, gcc6 does extremely poorly with the Altivec code. Moreover, on POWER7 and POWER8, a hand-optimized Altivec version turns out to be no faster than the revised integer version, and therefore not worth the effort.

  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cutils: Remove aarch64 buffer zero checking | Richard Henderson | 2016-09-13 | 1 | -15/+0
  The revised integer version is 4 times faster than the neon version on an AppliedMicro Mustang. Even with hand scheduling and additional unrolling I cannot make any neon version run as fast as the integer.

  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cutils: Rearrange buffer_is_zero acceleration | Richard Henderson | 2016-09-13 | 1 | -191/+157
  Allow selection of several acceleration functions based on the size and alignment of the buffer. Do not require ifunc support for AVX2 acceleration.

  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <1472496380-19706-5-git-send-email-rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
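Selecting a checker by buffer size and alignment can be sketched with an ordinary function pointer, which is also one way to avoid depending on ifunc. Everything below is hypothetical: the names, the 64-byte threshold, and the two candidate routines are assumptions for illustration, and a word loop stands in for the SSE/AVX2 paths of the real file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Common signature for every candidate routine (hypothetical). */
typedef bool (*zero_fn)(const void *buf, size_t len);

/* Fallback: works for any length and alignment. */
static bool zero_bytes(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* "Accelerated" stand-in: expects len to be a multiple of 8. */
static bool zero_words(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i += sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, p + i, sizeof(w));
        if (w != 0) {
            return false;
        }
    }
    return true;
}

/* Pick a routine from the buffer's size and alignment.
 * The 64-byte threshold is an arbitrary assumption. */
static zero_fn sketch_select(const void *buf, size_t len)
{
    bool aligned = ((uintptr_t)buf % sizeof(uint64_t)) == 0;

    if (aligned && len >= 64 && len % sizeof(uint64_t) == 0) {
        return zero_words;
    }
    return zero_bytes;
}

/* Single entry point, in the spirit of exporting only one function. */
bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    return sketch_select(buf, len)(buf, len);
}
```

A caller only ever sees `sketch_buffer_is_zero`; the choice of routine happens on each call through a plain pointer, so no linker or loader support is required.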
* cutils: Export only buffer_is_zero | Richard Henderson | 2016-09-13 | 1 | -4/+4
  Since the two users don't make use of the returned offset, beyond ensuring that the entire buffer is zero, consider the can_use_buffer_find_nonzero_offset and buffer_find_nonzero_offset functions internal.

  Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <1472496380-19706-4-git-send-email-rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cutils: Remove SPLAT macro | Richard Henderson | 2016-09-13 | 1 | -4/+0
  This is unused and complicates the vector interface.

  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <1472496380-19706-3-git-send-email-rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cutils: Move buffer_is_zero and subroutines to a new file | Richard Henderson | 2016-09-13 | 1 | -0/+272
  Signed-off-by: Richard Henderson <rth@twiddle.net>
  Message-Id: <1472496380-19706-2-git-send-email-rth@twiddle.net>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>