openslx/kernel-qcow2-linux.git - In-kernel qcow2 (Kernel part)

	Commit message (Collapse)	Author	Age	Files	Lines
*	i2c: core: Export bus recovery functions	Mark Brown	2015-04-15	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current -next fails to link an ARM allmodconfig because drivers that use the core recovery functions can be built as modules but those functions are not exported: ERROR: "i2c_generic_gpio_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined! ERROR: "i2c_generic_scl_recovery" [drivers/i2c/busses/i2c-davinci.ko] undefined! ERROR: "i2c_recover_bus" [drivers/i2c/busses/i2c-davinci.ko] undefined! Add exports to fix this. Fixes: 5f9296ba21b3c (i2c: Add bus recovery infrastructure) Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
*	i2c: jz4780: Fix build for m68k and sparc64	Guenter Roeck	2015-04-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix: drivers/i2c/busses/i2c-jz4780.c: In function 'jz4780_i2c_readw': drivers/i2c/busses/i2c-jz4780.c:181:2: error: implicit declaration of function 'readw' drivers/i2c/busses/i2c-jz4780.c: In function 'jz4780_i2c_writew': drivers/i2c/busses/i2c-jz4780.c:187:2: error: implicit declaration of function 'writew' seen with sparc64:allmodconfig and m68k:allmodconfig. The driver has to include linux/io.h. Fixes: ba92222ed63a ("i2c: jz4780: Add i2c bus controller driver for Ingenic JZ4780") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
*	Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm	Linus Torvalds	2015-04-15	93	-604/+2429
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull ARM updates from Russell King: "Included in this update are both some long term fixes and some new features. Fixes: - An integer overflow in the calculation of ELF_ET_DYN_BASE. - Avoiding OOMs for high-order IOMMU allocations - SMP requires the data cache to be enabled for synchronisation primitives to work, so prevent the CPU_DCACHE_DISABLE option being visible on SMP builds. - A bug going back 10+ years in the noMMU ARM94* CPU support code, where it corrupts registers. Found by folk getting Linux running on their cameras. - Versatile Express needs an errata workaround enabled for CPU hot-unplug to work. Features: - Clean up module linker by handling out of range relocations separately from relocation cases we don't handle. - Fix a long term bug in the pci_mmap_page_range() code, which we hope won't impact userspace (we hope there's no users of the existing broken interface.) - Don't map DMA coherent allocations when we don't have a MMU. - Drop experimental status for SMP_ON_UP. - Warn when DT doesn't specify ePAPR mandatory cache properties. - Add documentation concerning how we find the start of physical memory for AUTO_ZRELADDR kernels, detailing why we have chosen the mask and the implications of changing it. - Updates from Ard Biesheuvel to address some issues with large kernels (such as allyesconfig) failing to link. - Allow hibernation to work on modern (ARMv7) CPUs - this appears to have never worked in the past on these CPUs. - Enable IRQ_SHOW_LEVEL, which changes the /proc/interrupts output format (hopefully without userspace breaking... let's hope that if it causes someone a problem, they tell us.) - Fix tegra-ahb DT offsets. - Rework ARM errata 643719 code (and ARMv7 flush_cache_louis()/ flush_dcache_all()) code to be more efficient, and enable this errata workaround by default for ARMv7+SMP CPUs. This complements the Versatile Express fix above. - Rework ARMv7 context code for errata 430973, so that only Cortex A8 CPUs are impacted by the branch target buffer flush when this errata is enabled. Also update the help text to indicate that all r1p* A8 CPUs are impacted. - Switch ARM to the generic show_mem() implementation, it conveys all the information which we were already reporting. - Prevent slow timer sources being used for udelay() - timers running at less than 1MHz are not useful for this, and can cause udelay() to return immediately, without any wait. Using such a slow timer is silly. - VDSO support for 32-bit ARM, mainly for gettimeofday() using the ARM architected timer. - Perf support for Scorpion performance monitoring units" vdso semantic conflict fixed up as per linux-next. * 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (52 commits) ARM: update errata 430973 documentation to cover Cortex A8 r1p* ARM: ensure delay timer has sufficient accuracy for delays ARM: switch to use the generic show_mem() implementation ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs ARM: enable ARM errata 643719 workaround by default ARM: cache-v7: optimise test for Cortex A9 r0pX devices ARM: cache-v7: optimise branches in v7_flush_cache_louis ARM: cache-v7: consolidate initialisation of cache level index ARM: cache-v7: shift CLIDR to extract appropriate field before masking ARM: cache-v7: use movw/movt instructions ARM: allow 16-bit instructions in ALT_UP() ARM: proc-arm94*.S: fix setup function ARM: vexpress: fix CPU hotplug with CT9x4 tile. ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base address ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations ...
\| *	Merge branch 'devel-stable' into for-next	Russell King	2015-04-14	721	-4085/+8197
\| \|\
\| \| *	ARM: pmu: add support for interrupt-affinity property	Will Deacon	2015-03-24	2	-7/+63
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Historically, the PMU devicetree bindings have expected SPIs to be listed in order of logical CPU number. This is problematic for bootloaders, especially when the boot CPU (logical ID 0) isn't listed first in the devicetree. This patch adds a new optional property, interrupt-affinity, to the PMU node which allows the interrupt affinity to be described using a list of phandled to CPU nodes, with each entry in the list corresponding to the SPI at the same index in the interrupts property. Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
\| \| *	ARM: perf: reject groups spanning multiple hardware PMUs	Suzuki K. Poulose	2015-03-19	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The perf core implicitly rejects events spanning multiple HW PMUs, as in these cases the event->ctx will differ. However this validation is performed after pmu::event_init() is called in perf_init_event(), and thus pmu::event_init() may be called with a group leader from a different HW PMU. The ARM PMU driver does not take this fact into account, and when validating groups assumes that it can call to_arm_pmu(event->pmu) for any HW event. When the event in question is from another HW PMU this is wrong, and results in dereferencing garbage. This patch updates the ARM PMU driver to first test for and reject events from other PMUs, moving the to_arm_pmu and related logic after this test. Fixes a crash triggered by perf_fuzzer on Linux-4.0-rc2, with a CCI PMU present: --- CPU: 0 PID: 1527 Comm: perf_fuzzer Not tainted 4.0.0-rc2 #57 Hardware name: ARM-Versatile Express task: bd8484c0 ti: be676000 task.ti: be676000 PC is at 0xbf1bbc90 LR is at validate_event+0x34/0x5c pc : [<bf1bbc90>] lr : [<80016060>] psr: 00000013 ... [<80016060>] (validate_event) from [<80016198>] (validate_group+0x28/0x90) [<80016198>] (validate_group) from [<80016398>] (armpmu_event_init+0x150/0x218) [<80016398>] (armpmu_event_init) from [<800882e4>] (perf_try_init_event+0x30/0x48) [<800882e4>] (perf_try_init_event) from [<8008f544>] (perf_init_event+0x5c/0xf4) [<8008f544>] (perf_init_event) from [<8008f8a8>] (perf_event_alloc+0x2cc/0x35c) [<8008f8a8>] (perf_event_alloc) from [<8009015c>] (SyS_perf_event_open+0x498/0xa70) [<8009015c>] (SyS_perf_event_open) from [<8000e420>] (ret_fast_syscall+0x0/0x34) Code: bf1be000 bf1bb380 802a2664 00000000 (00000002) ---[ end trace 01aff0ff00926a0a ]--- Also cleans up the code to use the arm_pmu only when we know that we are dealing with an arm pmu event. Cc: Will Deacon <will.deacon@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Ziljstra (Intel) <peterz@infradead.org> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
\| \| *	ARM: perf: Add support for Scorpion PMUs	Stephen Boyd	2015-03-17	3	-0/+418
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scorpion supports a set of local performance monitor event selection registers (LPM) sitting behind a cp15 based interface that extend the architected PMU events to include Scorpion CPU and Venum VFP specific events. To use these events the user is expected to program the lpm register with the event code shifted into the group they care about and then point the PMNx event at that region+group combo by writing a LPMn_GROUPx event. Add support for this hardware. Note: the raw event number is a pure software construct that allows us to map the multi-dimensional number space of regions, groups, and event codes into a flat event number space suitable for use by the perf framework. This is based on code originally written by Sheetal Sahasrabudhe, Ashwin Chaugule, and Neil Leeder [1]. [1] https://www.codeaurora.org/cgit/quic/la/kernel/msm/tree/arch/arm/kernel/perf_event_msm.c?h=msm-3.4 Cc: Mark Rutland <mark.rutland@arm.com> Cc: Neil Leeder <nleeder@codeaurora.org> Cc: Ashwin Chaugule <ashwinc@codeaurora.org> Cc: Sheetal Sahasrabudhe <sheetals@codeaurora.org> Cc: <devicetree@vger.kernel.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
\| \| *	ARM: perf: Only reset PMxEVCNTCR registers on reset	Stephen Boyd	2015-03-17	1	-2/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Krait specific PMxEVCNTCR register is unpredictable upon reset. Currently we clear the register before we setup an event, but we don't need to do that. Instead, we can iterate through all the events and clear them once when we reset the PMU, saving a write in the event setup path. Cc: Neil Leeder <nleeder@codeaurora.org> Cc: Ashwin Chaugule <ashwinc@codeaurora.org> Cc: Sheetal Sahasrabudhe <sheetals@codeaurora.org> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
\| \| *	ARM: perf: Preparatory work for Scorpion PMU support	Stephen Boyd	2015-03-17	1	-57/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do some things to make the Krait PMU support code generic enough to be used by the Scorpion PMU support code. * Rename the venum register functions to be venum instead of krait specific because the same registers exist on Scorpion * Add some macros to decode our Krait specific event encoding that's the same on Scorpion (modulo an extra region). * Drop 'krait' from krait_clear_pmresrn_group() so it can be used by Scorpion code Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
\| \| \|
\| \| \
\| *-. \	Merge branches 'misc', 'vdso' and 'fixes' into for-next	Russell King	2015-04-14	30	-69/+1271
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: arch/arm/mm/proc-macros.S
\| \| \| * \|	ARM: proc-arm94*.S: fix setup function	Russell King	2015-04-10	3	-34/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both ARM946 and ARM940 setup functions were corrupting r1 and r2, which is not permissible - these are used to carry the machine ID and boot data into the kernel, and must be preserved. The code responsible for this was the same in both files: they were using the registers to generate a protection region register value. Fix this by turning this process into a macro, and using that macro in both these files with an alternative register allocation. r0, r3 and r7 can be used for temporary values here. Reported-by: Alex Dumitrache <broscutamaker@gmail.com> Tested-by: Georg Hofstetter <g3gg0.de@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| \| * \|	ARM: vexpress: fix CPU hotplug with CT9x4 tile.	Russell King	2015-04-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Cortex A9 tile fails to unplug CPUs if errata 643719 is not enabled. This leads to random weird behaviours, but ultimately seem to lock the kernel one way or another when a CPU is hot unplugged. Symptoms range from a spinlock lockup in the scheduler, the entire system hanging, to dumping out the kernel printk buffer a few lines at a time, and other weird behaviours. This is caused by the outgoing CPU not having its inner caches properly flushed before it exits coherency - flush_cache_louis() is used to achieve this, but as a result of the hardware bug, this function ends up doing nothing without the errata workaround enabled. As the Versatile Express has an affected CPU, this errata must always be enabled. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| \| * \|	ARM: 8276/1: Make CPU_DCACHE_DISABLE depend on !SMP	Florian Fainelli	2015-04-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enabling CPU_DCACHE_DISABLE on a SMP capable system will prevent the kernel from booting because of the following ldrex instruction in arch_spin_lock: (gdb) x/10i $pc => 0xc053cfa8 <_raw_spin_lock+4>: ldrex r3, [r0] 0xc053cfac <_raw_spin_lock+8>: add r2, r3, #65536 ; 0x10000 which is taken by the very first printk call: at /home/fainelli/work/linux/arch/arm/include/asm/spinlock.h:65 fmt=0xc0637650 " 01 66Booting Linux on physical CPU 0x%xn", args=<incomplete type>) at kernel/printk/printk.c:1525 fmt=0xc05370f4 <printk+52> " 24320215342 04340235344 20320215342 36377/341 17") at kernel/printk/printk.c:1688 ldrex requires exclusive monitor(s) (local or global) which are no longer working when the Data cache is disabled in CP15 and will just hang the CPU there. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| \| * \|	ARM: 8337/1: mm: Do not invoke OOM for higher order IOMMU DMA allocations	Tomasz Figa	2015-04-02	1	-6/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IOMMU should be able to use single pages as well as bigger blocks, so if higher order allocations fail, we should not affect state of the system, with events such as OOM killer, but rather fall back to order 0 allocations. This patch changes the behavior of ARM IOMMU DMA allocator to use __GFP_NORETRY, which bypasses OOM invocation, for orders higher than zero and, only if that fails, fall back to normal order 0 allocation which might invoke OOM killer. Signed-off-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Doug Anderson <dianders@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| \| * \|	ARM: 8320/1: fix integer overflow in ELF_ET_DYN_BASE	Andrey Ryabinin	2015-03-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Usually ELF_ET_DYN_BASE is 2/3 of TASK_SIZE. With 3G/1G user/kernel split this is not so, because 2TASK_SIZE overflows 32 bits, so the actual value of ELF_ET_DYN_BASE is: (2 TASK_SIZE / 3) = 0x2a000000 When ASLR is disabled PIE binaries will load at ELF_ET_DYN_BASE address. On 32bit platforms AddressSanitzer uses addresses [0x20000000 - 0x40000000] for shadow memory [1]. So ASan doesn't work for PIE binaries when ASLR disabled as it fails to map shadow memory. Also after Kees's 'split ET_DYN ASLR from mmap ASLR' patchset PIE binaries has a high chance of loading somewhere in between [0x2a000000 - 0x40000000] even if ASLR enabled. This makes ASan with PIE absolutely incompatible. Fix overflow by dividing TASK_SIZE prior to multiplying. After this patch ELF_ET_DYN_BASE equals to (for CONFIG_VMSPLIT_3G=y): (TASK_SIZE / 3 * 2) = 0x7f555554 [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerAlgorithm#Mapping Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Reported-by: Maria Guseva <m.guseva@samsung.com> Cc: stable@vger.kernel.org Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| * \| \|	ARM: 8332/1: add CONFIG_VDSO Kconfig and Makefile bits	Nathan Lynch	2015-03-27	3	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow users to enable the vdso in Kconfig; include the vdso in the build if CONFIG_VDSO is enabled. Add 'vdso_install' target. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| * \| \|	ARM: 8331/1: VDSO initialization, mapping, and synchronization	Nathan Lynch	2015-03-27	2	-3/+351
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Initialize the VDSO page list at boot, install the VDSO mapping at exec time, and update the data page during timer ticks. This code is not built if CONFIG_VDSO is not enabled. Account for the VDSO length when randomizing the offset from the stack. The [vdso] and [vvar] pages are placed immediately following the sigpage with separate _install_special_mapping calls. We want to "penalize" systems lacking the arch timer as little as possible. Previous versions of this code installed the VDSO unconditionally and unmodified, making it a measurably slower way for glibc to invoke the real syscalls on such systems. E.g. calling gettimeofday via glibc goes from ~560ns to ~630ns on i.MX6Q. If we can indicate to glibc that the time-related APIs in the VDSO are not accelerated, glibc can continue to invoke the syscalls directly instead of dispatching through the VDSO only to fall back to the slow path. Thus, if the architected timer is unusable for whatever reason, patch the VDSO at boot time so that symbol lookups for gettimeofday and clock_gettime return NULL. (This is similar to what powerpc does and borrows code from there.) This allows glibc to perform the syscall directly instead of passing control to the VDSO, which minimizes the penalty. In my measurements the time taken for a gettimeofday call via glibc goes from ~560ns to ~580ns (again on i.MX6Q), and this is solely due to adding a test and branch to glibc's gettimeofday syscall wrapper. An alternative to patching the VDSO at boot would be to not install the VDSO at all when the arch timer isn't usable. Another alternative is to include a separate "dummy" vdso.so without gettimeofday and clock_gettime, which would be selected at boot time. Either of these would get cumbersome if the VDSO were to gain support for an API such as getcpu which is unrelated to arch timer support. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| * \| \|	ARM: 8330/1: add VDSO user-space code	Nathan Lynch	2015-03-27	8	-0/+700
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Place VDSO-related user-space code in arch/arm/kernel/vdso/. It is almost completely written in C with some assembly helpers to load the data page address, sample the counter, and fall back to system calls when necessary. The VDSO can service gettimeofday and clock_gettime when CONFIG_ARM_ARCH_TIMER is enabled and the architected timer is present (and correctly configured). It reads the CP15-based virtual counter to compute high-resolution timestamps. Of particular note is that a post-processing step ("vdsomunge") is necessary to produce a shared object which is architecturally allowed to be used by both soft- and hard-float EABI programs. The 2012 edition of the ARM ABI defines Tag_ABI_VFP_args = 3 "Code is compatible with both the base and VFP variants; the user did not permit non-variadic functions to pass FP parameters/results." Unfortunately current toolchains do not support this tag, which is ideally what we would use. The best available option is to ensure that both EF_ARM_ABI_FLOAT_SOFT and EF_ARM_ABI_FLOAT_HARD are unset in the ELF header's e_flags, indicating that the shared object is "old" and should be accepted for backward compatibility's sake. While binutils < 2.24 appear to produce a vdso.so with both flags clear, 2.24 always sets EF_ARM_ABI_FLOAT_SOFT, with no way to inhibit this behavior. So we have to fix things up with a custom post-processing step. In fact, the VDSO code in glibc does much less validation (including checking these flags) than the code for handling conventional file-backed shared libraries, so this is a bit moot unless glibc's VDSO code becomes more strict. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| \| * \| \|	ARM: 8329/1: miscellaneous vdso infrastructure, preparation	Nathan Lynch	2015-03-27	8	-1/+113
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Define the layout of the data structure shared between kernel and userspace. Track the vdso address in the mm_context; needed for communicating AT_SYSINFO_EHDR to the ELF loader. Add declarations for arm_install_vdso; implementation is in a following patch. Define AT_SYSINFO_EHDR, and, if CONFIG_VDSO=y, report the vdso shared object address via the ELF auxiliary vector. Note - this adds the AT_SYSINFO_EHDR in a new user-visible header asm/auxvec.h; this is consistent with other architectures. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: update errata 430973 documentation to cover Cortex A8 r1p*	Russell King	2015-04-14	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This errata covers all r1 variants of Cortex A8, it's not limited to just r1p0..r1p2. Update the documentation to reflect this. The code already applies the workaround to all r1p* A8 CPUs. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: ensure delay timer has sufficient accuracy for delays	Russell King	2015-04-14	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have recently had an example of someone wanting to use a 90kHz timer for the software delay loop. udelay() needs to have at least microsecond resolution to allow drivers access to a delay mechanism with a reasonable chance of delaying the period they requested within at least a 50% marging of error, especially for small delays. Discussion about the udelay() accuracy can be found at: https://lkml.org/lkml/2011/1/9/37 Reject timers which are unable to supply this level of resolution. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: switch to use the generic show_mem() implementation	Russell King	2015-04-14	1	-49/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Switch ARM to use the generic show_mem() implementation, which displays the statistics from the mm zone rather than walking the page arrays. Acked-by: Mel Gorman <mgorman <mgorman@suse.de> Tested-by: Gregory Fong <gregory.0xf0@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: proc-v7: avoid errata 430973 workaround for non-Cortex A8 CPUs	Russell King	2015-04-14	2	-4/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid the errata 430973 workaround for non-Cortex A8 CPUs. Having this workaround enabled introduces an additional branch target buffer flush into the context switching path, something we wish to avoid. To allow this errata to be enabled in multiplatform kernels while reducing its impact, rearrange the Cortex-A8 CPU support to avoid impacting on other Version 7 CPUs. Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: enable ARM errata 643719 workaround by default	Russell King	2015-04-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The effects of not having ARM errata 643719 enabled on affected CPUs can be very confusing and hard to debug. Rather than leave this to chance, enable this workaround by default. Now that we have rearranged the code, it should have a low impact on the majority of CPUs. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: cache-v7: optimise test for Cortex A9 r0pX devices	Russell King	2015-04-14	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Eliminate one unnecessary instruction from this test by pre-shifting the Cortex A9 ID - we can shift the actual ID in the teq instruction thereby losing the pX bit of the ID at no cost. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: cache-v7: optimise branches in v7_flush_cache_louis	Russell King	2015-04-14	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Optimise the branches such that for the majority of unaffected devices, we avoid needing to execute the errata work-around code path by branching to start_flush_levels early. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: cache-v7: consolidate initialisation of cache level index	Russell King	2015-04-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both v7_flush_cache_louis and v7_flush_dcache_all both begin the flush_levels loop with r10 initialised to zero. In each case, this is done immediately prior to entering the loop. Branch to this instruction in v7_flush_dcache_all from v7_flush_cache_louis and eliminate the unnecessary initialisation in v7_flush_cache_louis. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: cache-v7: shift CLIDR to extract appropriate field before masking	Russell King	2015-04-14	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than have code which masks and then shifts, such as: mrc p15, 1, r0, c0, c0, 1 ALT_SMP(ands r3, r0, #7 << 21) ALT_UP( ands r3, r0, #7 << 27) ALT_SMP(mov r3, r3, lsr #20) ALT_UP( mov r3, r3, lsr #26) re-arrange this as a shift and then mask. The masking is the same for each field which we want to extract, so this allows the mask to be shared amongst code paths: mrc p15, 1, r0, c0, c0, 1 ALT_SMP(mov r3, r0, lsr #20) ALT_UP( mov r3, r0, lsr #26) ands r3, r3, #7 << 1 Use this method for the LoUIS, LoUU and LoC fields. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: cache-v7: use movw/movt instructions	Russell King	2015-04-14	1	-5/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We always build cache-v7.S for ARMv7, so we can use the ARMv7 16-bit move instructions to load large constants, rather than using constants in a literal pool. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: allow 16-bit instructions in ALT_UP()	Russell King	2015-04-14	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow ALT_UP() to cope with a 16-bit Thumb instruction by automatically inserting a following nop instruction. This allows us to care less about getting the assembler to emit a 32-bit thumb instruction. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8335/1: Documentation: DT bindings: Tegra AHB: document the legacy base ↵	Paul Walmsley	2015-04-02	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	address Documentation: DT bindings: Tegra AHB: require the legacy base address for existing chips Per Stephen Warren, note in the Tegra AHB DT binding documentation that we specifically deprecate any attempt to use the IP block's actual hardware base address, and advocate the use of the legacy "off-by-four" address in the 'regs' property, for Tegra chips with existing upstream Linux DT files that include a Tegra AHB node. This patch updates the documentation accordingly. Changing the existing kernel DT data isn't under consideration because Linux kernel DT data policy is to preserve compatibility between newer DT data files and older kernels. However, this additional step of changing the documentation should discourage others from sending kernel patches to try to change the legacy kernel DT data. Furthermore, for out-of-tree software (such as bootloaders or other operating systems) that may rely on Linux kernel DT binding documentation as an ABI (but not the Linux kernel DT data itself), such a change may allow future convergence with the Linux kernel DT data without additional code changes. Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Paul Walmsley <pwalmsley@nvidia.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Alexandre Courbot <gnurou@gmail.com> Cc: Eduardo Valentin <edubezval@gmail.com> Cc: Ian Campbell <ijc+devicetree@hellion.org.uk> Cc: Kumar Gala <galak@codeaurora.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8334/1: amba: tegra-ahb: detect and correct bogus base address	Paul Walmsley	2015-04-02	1	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	amba: tegra-ahb: detect and correct bogus base address From a hardware SoC integration point of view, the starting address of this IP block in the existing Tegra SoC DT files is off by 4 bytes from the actual base address. Since we attempt to make old DT files forward-compatible with newer kernels, we cannot fix the IP block base address in old DT data. This patch works around the problem by detecting the four byte base address offset in the driver code, and correcting it if it's detected. (In general, IP block base addresses almost always have a null low byte.) Future SoC DT data for Tegra AHB should use the correct Tegra AHB base address, in cases where there is no DT data backward compatibility requirement. This patch is a revision of the patch originally titled "amba: tegra-ahb: use correct base address for future chip support". This revision implements changes requested by Russell King: http://marc.info/?l=linux-tegra&m=142658851825062&w=2 http://marc.info/?l=linux-tegra&m=142658873925178&w=2 Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Paul Walmsley <pwalmsley@nvidia.com> Cc: Alexandre Courbot <gnurou@gmail.com> Cc: Hiroshi DOYU <hdoyu@nvidia.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: linux-kernel@vger.kernel.org Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8333/1: amba: tegra-ahb: fix register offsets in the macros	Paul Walmsley	2015-04-02	1	-31/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	amba: tegra-ahb: fix register offsets in the macros From a hardware SoC integration point of view, the offsets of the Tegra AHB registers that are currently defined in tegra-ahb.c macros are all off by four bytes. Similarly, the starting address of this IP block in our existing DT files is also off by four bytes. Since we attempt to make old DT files forward-compatible with newer kernels, we cannot fix the IP block base address in old DT data. However, we can fix the offsets in the driver so that they are correct with respect to the hardware, which is what this patch does. And a subsequent patch will allow the offset to be removed for DT 'compatible' strings used in future DT files for newer Tegra chips that the kernel does not yet support. Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Paul Walmsley <pwalmsley@nvidia.com> Cc: Alexandre Courbot <gnurou@gmail.com> Cc: Hiroshi DOYU <hdoyu@nvidia.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: linux-kernel@vger.kernel.org Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8339/1: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL	Geert Uytterhoeven	2015-04-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several interrupt controllers support both edge and level interrupts, so it's useful to provide that information in /proc/interrupts. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility	Geert Uytterhoeven	2015-04-02	3	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When trying to kexec into a new kernel on a platform where multiple CPU cores are present, but no SMP bringup code is available yet, the kexec_load system call fails with: kexec_load failed: Invalid argument The SMP test added to machine_kexec_prepare() in commit 2103f6cba61a8b8b ("ARM: 7807/1: kexec: validate CPU hotplug support") wants to prohibit kexec on SMP platforms where it cannot disable secondary CPUs. However, this test is too strict: if the secondary CPUs couldn't be enabled in the first place, there's no need to disable them later at kexec time. Hence skip the test in the absence of SMP bringup code. This allows to add all CPU cores to the DTS from the beginning, without having to implement SMP bringup first, improving DT compatibility. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: move reboot code to arch/arm/kernel/reboot.c	Russell King	2015-04-02	5	-150/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move shutdown and reboot related code to a separate file, out of process.c. This helps to avoid polluting process.c with non-process related code. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: fix broken hibernation	Russell King	2015-04-02	3	-4/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally, when a CPU wants to clear a cache line to zero in the external L2 cache, it would generate bus cycles to write each word as it would do with any other data access. However, a Cortex A9 connected to a L2C-310 has a specific feature where the CPU can detect this operation, and signal that it wants to zero an entire cache line. This feature, known as Full Line of Zeros (FLZ), involves a non-standard AXI signalling mechanism which only the L2C-310 can properly interpret. There are separate enable bits in both the L2C-310 and the Cortex A9 - the L2C-310 needs to be enabled and have the FLZ enable bit set in the auxiliary control register before the Cortex A9 has this feature enabled. Unfortunately, the suspend code was not respecting this - it's not obvious from the code: swsusp_arch_suspend() cpu_suspend() /* saves the Cortex A9 auxiliary control register / arch_save_image() soft_restart() / turns off FLZ in Cortex A9, and disables L2C / cpu_resume() / restores the Cortex A9 registers, inc auxcr / At this point, we end up with the L2C disabled, but the Cortex A9 with FLZ enabled - which means any memset() or zeroing of a full cache line will fail to take effect. A similar issue exists in the resume path, but it's slightly more complex: swsusp_arch_suspend() cpu_suspend() / saves the Cortex A9 auxiliary control register / arch_save_image() / image with A9 auxcr saved / ... swsusp_arch_resume() call_with_stack() arch_restore_image() / restores image with A9 auxcr saved above / soft_restart() / turns off FLZ in Cortex A9, and disables L2C / cpu_resume() / restores the Cortex A9 registers, inc auxcr */ Again, here we end up with the L2C disabled, but Cortex A9 FLZ enabled. There's no need to turn off the L2C in either of these two paths; there are benefits from not doing so - for example, the page copies will be faster with the L2C enabled. Hence, fix this by providing a variant of soft_restart() which can be used without turning the L2 cache controller off, and use it in both of these paths to keep the L2C enabled across the respective resume transitions. Fixes: 8ef418c7178f ("ARM: l2c: trial at enabling some Cortex-A9 optimisations") Reported-by: Sean Cross <xobs@kosagi.com> Tested-by: Sean Cross <xobs@kosagi.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8326/1: s5pv210: move resume code to .text section	Ard Biesheuvel	2015-03-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This code calls cpu_resume() using a straight branch (b), so now that we have moved cpu_resume() back to .text, this should be moved there as well. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8325/1: exynos: move resume code to .text section	Ard Biesheuvel	2015-03-30	1	-15/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This code calls cpu_resume() using a straight branch (b), so now that we have moved cpu_resume() back to .text, this should be moved there as well. Any direct references to symbols that will remain in the .data section are replaced with explicit PC-relative references. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8324/1: move cpu_resume() to .text section	Ard Biesheuvel	2015-03-30	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move cpu_resume() to the .text section where it belongs. Change the adr reference to sleep_save_sp to an explicit PC relative reference so sleep_save_sp itself can remain in .data. This helps prevent linker failure on large kernels, as the code in the .data section may be too far away to be in range for normal b/bl instructions. Reviewed-by: Nicolas Pitre <nico@linaro.org> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8323/1: force linker to use PIC veneers	Ard Biesheuvel	2015-03-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When building a very large kernel, it is up to the linker to decide when and where to insert stubs to allow calls to functions that are out of range for the ordinary b/bl instructions. However, since the kernel is built as a position dependent binary, these stubs (aka veneers) may contain absolute addresses, which will break far calls performed with the MMU off. For instance, the call from __enable_mmu() in the .head.text section to __turn_mmu_on() in the .idmap.text section may be turned into something like this: c0008168 <__enable_mmu>: c0008168: f020 0002 bic.w r0, r0, #2 c000816c: f420 5080 bic.w r0, r0, #4096 c0008170: f000 b846 b.w c0008200 <____turn_mmu_on_veneer> [...] c0008200 <____turn_mmu_on_veneer>: c0008200: 4778 bx pc c0008202: 46c0 nop c0008204: e59fc000 ldr ip, [pc] c0008208: e12fff1c bx ip c000820c: c13dfae1 teqgt sp, r1, ror #21 [...] c13dfae0 <__turn_mmu_on>: c13dfae0: 4600 mov r0, r0 [...] After adding --pic-veneer to the LDFLAGS, the veneer is emitted like this instead: c0008200 <____turn_mmu_on_veneer>: c0008200: 4778 bx pc c0008202: 46c0 nop c0008204: e59fc004 ldr ip, [pc, #4] c0008208: e08fc00c add ip, pc, ip c000820c: e12fff1c bx ip c0008210: 013d7d31 teqeq sp, r1, lsr sp c0008214: 00000000 andeq r0, r0, r0 Note that this particular example is best addressed by moving .head.text and .idmap.text closer together, but this issue could potentially affect any code that needs to execute with the MMU off. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8322/1: keep .text and .fixup regions closer together	Ard Biesheuvel	2015-03-30	11	-20/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This moves all fixup snippets to the .text.fixup section, which is a special section that gets emitted along with the .text section for each input object file, i.e., the snippets are kept much closer to the code they refer to, which helps prevent linker failure on large kernels. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8321/1: asm-generic: introduce .text.fixup input section	Ard Biesheuvel	2015-03-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces a new .text.fixup input section that gets emitted together with the .text section for each input object file. Note that (.text) (.text.fixup) is not the same as *(.text .text.fixup) and we are looking for the latter, to ensure that fixup snippets that are assembled into a separate section in the object file do not end up out of range for the relative branch instructions it contains if the .text section itself grows very large. This helps prevent linker failures on large ARM kernels. Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8307/1: psci: move psci firmware calls out of line	Mark Rutland	2015-03-30	3	-37/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	arm64 builds with GCC 5 have caused the __asmeq assertions in the PSCI calling code to fire, so move the ARM PSCI calls out of line into their own assembly file for consistency and to safeguard against the same issue occuring with the 32-bit toolchain. [will: brought into line with arm64 implementation] Reported-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8328/1: remove empty preprocessor #else branch	Uwe Kleine-König	2015-03-28	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the patch for e16343c47e42 (ARM: 8160/1: drop warning about return_address not using unwind tables) was created there was still more code in said branch. Probably this simplification was just missed during conflict resolution when the patch was applied. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8327/1: zImage: add support for ARMv7-M	Joachim Eastwood	2015-03-28	2	-6/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes it possible to enter zImage in Thumb mode for ARMv7-M (Cortex-M) CPUs that do not support ARM mode. The kernel entry is also made in Thumb mode. [ukl: fix spelling in commit log, return early in call_cache_fn] Signed-off-by: Joachim Eastwood <manabian@gmail.com> Tested-by: Stefan Agner <stefan@agner.ch> Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Tested-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8319/1: advertise availability of v8 Crypto instructions	Ard Biesheuvel	2015-03-28	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When running the 32-bit ARM kernel on ARMv8 capable bare metal (e.g., 32-bit Android userland and kernel on a Cortex-A53), or as a KVM guest on a 64-bit host, we should advertise the availability of the Crypto instructions, so that userland libraries such as OpenSSL may use them. (Support for the v8 Crypto instructions in the 32-bit build was added to OpenSSL more than six months ago) This adds the ID feature bit detection, and sets elf_hwcap2 accordingly. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8318/1: treat CPU feature register fields as signed quantities	Ard Biesheuvel	2015-03-28	2	-13/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The various CPU feature registers consist of 4-bit blocks that represent signed quantities, whose positive values represent incremental features, and whose negative values are reserved. To improve forward compatibility, update the feature detection code to take possible future higher values into account, but ignore negative values. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8317/1: move the .idmap.text section closer to .head.text	Ard Biesheuvel	2015-03-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This moves the .idmap.text section closer to .head.text, so that relative branches are less likely to go out of range if the kernel text gets bigger. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
\| * \| \| \|	ARM: 8314/1: replace PROCINFO embedded branch with relative offset	Ard Biesheuvel	2015-03-28	26	-67/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch replaces the 'branch to setup()' instructions embedded in the PROCINFO structs with the offset to that setup function relative to the base of the struct. This preserves the position independent nature of that field, but uses a data item rather than an instruction. This is mainly done to prevent linker failures on large kernels, where the setup function is out of reach for the branch. Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>