summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/lib/copypage_64.S
Commit message (Collapse)AuthorAgeFilesLines
* treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner2019-05-301-5/+1Star
| | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* powerpc: clean inclusions of asm/feature-fixups.hChristophe Leroy2018-07-301-0/+1
| | | | | | | | files not using feature fixup don't need asm/feature-fixups.h files using feature fixup need asm/feature-fixups.h Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64s: Set assembler machine type to POWER4Nicholas Piggin2018-03-311-0/+2
| | | | | | | | | | | | | | Rather than override the machine type in .S code (which can hide wrong or ambiguous code generation for the target), set the type to power4 for all assembly. This also means we need to be careful not to build power4-only code when we're not building for Book3S, such as the "power7" versions of copyuser/page/memcpy. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fix Book3E build, don't build the "power7" variants for non-Book3S] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* powerpc/64: Fix naming of cache block vs. cache lineBenjamin Herrenschmidt2017-02-061-2/+2
| | | | | | | | | | | | | In a number of places we called "cache line size" what is actually the cache block size, which in the powerpc architecture, means the effective size to use with cache management instructions (it can be different from the actual cache line size). We fix the naming across the board and properly retrieve both pieces of information when available in the device-tree. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
* ppc: move exports to definitionsAl Viro2016-08-081-0/+2
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* powerpc: Exported functions __clear_user and copy_page use r2 so need ↵Anton Blanchard2014-06-051-1/+1
| | | | | | | | | | | _GLOBAL_TOC() __clear_user and copy_page load from the TOC and are also exported to modules. This means we have to use _GLOBAL_TOC() so that we create the global entry point that sets up the TOC. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: No need to use dot symbols when branching to a functionAnton Blanchard2014-04-231-1/+1
| | | | | | | | | binutils is smart enough to know that a branch to a function descriptor is actually a branch to the functions text address. Alan tells me that binutils has been doing this for 9 years. Signed-off-by: Anton Blanchard <anton@samba.org>
* powerpc: POWER7 optimised copy_page using VMX and enhanced prefetchAnton Blanchard2012-07-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Implement a POWER7 optimised copy_page using VMX and enhanced prefetch instructions. We use enhanced prefetch hints to prefetch both the load and store side. We copy a cacheline at a time and fall back to regular loads and stores if we are unable to use VMX (eg we are in an interrupt). The following microbenchmark was used to assess the impact of the patch: http://ozlabs.org/~anton/junkcode/page_fault_file.c We test MAP_PRIVATE page faults across a 1GB file, 100 times: # time ./page_fault_file -p -l 1G -i 100 Before: 22.25s After: 18.89s 17% faster Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Simplify 4k/64k copy_page logicAnton Blanchard2011-05-191-3/+4
| | | | | | | | To make it easier to add optimised versions of copy_page, remove the 4kB loop for 64kB pages and just do all the work in copy_page. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: Pair loads and stores in copy_4k_pageAnton Blanchard2010-02-171-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A number of our chips like loads and stores to be paired. A small kernel module testcase shows the improvement of pairing loads and stores in copy_4k_page: POWER6: +9% POWER7: +1.5% #include <linux/module.h> #include <linux/mm.h> #define ITERATIONS 10000000 static int __init copypage_init(void) { struct timespec before, after; unsigned long i; struct page *destpage, *srcpage; char *dest, *src; destpage = alloc_page(GFP_KERNEL); srcpage = alloc_page(GFP_KERNEL); dest = page_address(destpage); src = page_address(srcpage); getnstimeofday(&before); for (i = 0; i < ITERATIONS; i++) copy_4K_page(dest, src); getnstimeofday(&after); free_page((unsigned long)dest); free_page((unsigned long)src); printk(KERN_DEBUG "copy_4K_page loop took %lu ns\n", (after.tv_sec - before.tv_sec) * NSEC_PER_SEC + (after.tv_nsec - before.tv_nsec)); return 0; } static void __exit copypage_exit(void) { } module_init(copypage_init) module_exit(copypage_exit) MODULE_LICENSE("GPL"); MODULE_AUTHOR("Anton Blanchard"); Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* powerpc: perf_event: Cleanup copy_page output by hiding setup symbolAnton Blanchard2009-10-281-2/+2
| | | | | | | | A lot of hits in "setup" doesn't make much sense, so hide this symbol and allow all the hits to end up in copy_4k_page. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* powerpc: New copy_4K_page()Mark Nelson2008-09-151-105/+93Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This new copy_4K_page() function was originally tuned for the best performance on the Cell processor, but after testing on more 64bit powerpc chips it was found that with a small modification it either matched the performance offered by the current mainline version or bettered it by a small amount. It was found that on a Cell-based QS22 blade the amount of system time measured when compiling a 2.6.26 pseries_defconfig decreased by 4%. Using the same test, a 4-way 970MP machine saw a decrease of 2% in system time. No noticeable change was seen on Power4, Power5 or Power6. The 4096 byte page is copied in thirty-two 128 byte strides. An initial setup loop executes dcbt instructions for the whole source page and dcbz instructions for the whole destination page. To do this, the cache line size is retrieved from ppc64_caches. A new CPU feature bit, CPU_FTR_CP_USE_DCBTZ, (introduced in the previous patch) is used to make the modification to this new copy routine - on Power4, 970 and Cell the feature bit is set so the setup loop is executed, but on all other 64bit chips the setup loop is nop'ed out. Signed-off-by: Mark Nelson <markn@au1.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [PATCH] powerpc: trivial: modify comments to refer to new location of filesJon Mason2006-02-101-2/+0Star
| | | | | | | | | This patch removes all self references and fixes references to files in the now defunct arch/ppc64 tree. I think this accomplises everything wanted, though there might be a few references I missed. Signed-off-by: Jon Mason <jdmason@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [PATCH] ppc64: support 64k pagesBenjamin Herrenschmidt2005-11-071-1/+1
| | | | | | | | | | | | | | | Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel base page size to 64K. The resulting kernel still boots on any hardware. On current machines with 4K pages support only, the kernel will maintain 16 "subpages" for each 64K page transparently. Note that while real 64K capable HW has been tested, the current patch will not enable it yet as such hardware is not released yet, and I'm still verifying with the firmware architects the proper to get the information from the newer hypervisors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* powerpc: Rename files to have consistent _32/_64 suffixesPaul Mackerras2005-10-101-0/+121
This doesn't change any code, just renames things so we consistently have foo_32.c and foo_64.c where we have separate 32- and 64-bit versions. Signed-off-by: Paul Mackerras <paulus@samba.org>