summaryrefslogtreecommitdiffstats
path: root/arch/powerpc
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'merge' of ↵Linus Torvalds2008-07-3043-62/+757
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/mm: Lockless get_user_pages_fast() for 64-bit (v3) powerpc: Don't use the wrong thread_struct for ptrace get/set VSX regs powerpc: Fix ptrace buffer size for VSX powerpc: Correctly hookup PTRACE_GET/SETVSRREGS for 32 bit processes ide/powermac: Fix use of uninitialized pointer on media-bay powerpc: Allow non-hcall return values for lparcfg writes ipmi/powerpc: Use linux/of_{device,platform}.h instead of asm powerpc/fsl: proliferate simple-bus compatibility to soc nodes Documentation: remove old sbc8260 board specific information cpm2: Rework baud rate generators configuration to support external clocks. powerpc: rtc_cmos_setup: assign interrupts only if there is i8259 PIC cpm_uart: Add generic clock API support to set baudrates cpm_uart: Modem control lines support powerpc: implement GPIO LIB API on CPM1 Freescale SoC. cpm2: Implement GPIO LIB API on CPM2 Freescale SoC. powerpc: Fix 8xx build failure powerpc: clean up the Book-E HW watchpoint support
| * powerpc/mm: Lockless get_user_pages_fast() for 64-bit (v3)Nick Piggin2008-07-303-1/+285
| | | | | | | | | | | | | | | | | | | | | | | | | | Implement lockless get_user_pages_fast for 64-bit powerpc. Page table existence is guaranteed with RCU, and speculative page references are used to take a reference to the pages without having a prior existence guarantee on them. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Don't use the wrong thread_struct for ptrace get/set VSX regsMichael Neuling2008-07-301-2/+2
| | | | | | | | | | | | | | | | In PTRACE_GET/SETVSRREGS, we should be using the thread we are ptracing rather than current. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Fix ptrace buffer size for VSXMichael Neuling2008-07-301-4/+2Star
| | | | | | | | | | | | | | Fix cut-and-paste error in the size setting for ptrace buffers for VSX. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Correctly hookup PTRACE_GET/SETVSRREGS for 32 bit processesMichael Neuling2008-07-301-0/+2
| | | | | | | | | | | | | | Fix bug where PTRACE_GET/SETVSRREGS are not connected for 32 bit processes. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc: Allow non-hcall return values for lparcfg writesNathan Fontenot2008-07-301-4/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code to handle writes to /proc/ppc64/lparcfg incorrectly assumes that the return code from the helper routines to update processor or memory entitlement return a hcall return value. It then assumes any non-hcall return value is bad and sets the return code for the write to be -EIO. The update_[mp]pp routines can return values other than a hcall return value. This patch removes the automatic setting of any return code that is not an hcall return value from these routines to -EIO. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * Merge commit 'kumar/kumar-next'Benjamin Herrenschmidt2008-07-3037-51/+466
| |\
| | * powerpc/fsl: proliferate simple-bus compatibility to soc nodesKim Phillips2008-07-3031-10/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | add simple-bus compatible property to soc nodes for 83xx/85xx platforms that were missing them. Add same to platform probe code. This fixes SoC device drivers (such as talitos) to succeed in matching devices present in the soc node. also update mpc836x_rdk dts to new SEC bindings (overlooked in commit 3fd4473: powerpc/fsl: update crypto node definition and device tree instances). Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * cpm2: Rework baud rate generators configuration to support external clocks.Laurent Pinchart2008-07-281-30/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CPM2 BRG setup functions cpm_setbrg and cpm2_fastbrg don't support external clocks. This patch adds a new exported __cpm2_setbrg function that takes the clock rate and clock source as extra parameters, and moves cpm_setbrg and cpm2_fastbrg to include/asm-powerpc/cpm2.h where they become inline wrappers around __cpm2_setbrg. Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * powerpc: rtc_cmos_setup: assign interrupts only if there is i8259 PICAnton Vorontsov2008-07-281-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i8259 PIC is disabled on MPC8610HPCD boards, thus currently rtc-cmos driver fails to probe. To fix the issue, we lookup the device tree for "chrp,iic" and "pnpPNP,000" compatible devices, and if not found we do not assign RTC IRQ and assuming that i8259 was disabled. Though this patch fixes RTC on some boards (and surely should not break any other), the whole approach is still broken. We can't easily fix this though, because old device trees do not specify i8259 interrupts for the cmos rtc node. Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * cpm_uart: Add generic clock API support to set baudratesLaurent Pinchart2008-07-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces baudrate setting support via the generic clock API. When present the optional device tree clock property is used instead of fsl-cpm-brg. Platforms can then define complex clock schemes, to output the serial clock on an external pin for instance. Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * powerpc: implement GPIO LIB API on CPM1 Freescale SoC.Jochen Friedrich2008-07-282-5/+272
| | | | | | | | | | | | | | | | | | | | | This patch implement GPIO LIB support for the CPM1 GPIOs. Signed-off-by: Jochen Friedrich <jochen@scram.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * cpm2: Implement GPIO LIB API on CPM2 Freescale SoC.Laurent Pinchart2008-07-283-0/+136
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implement GPIO LIB support for the CPM2 GPIOs. The code can also be used for CPM1 GPIO port E, as both cores are compatible at the register level. Based on earlier work by Laurent Pinchart. Signed-off-by: Jochen Friedrich <jochen@scram.de> Cc: Laurent Pinchart <laurentp@cse-semaphore.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| | * powerpc: clean up the Book-E HW watchpoint supportKumar Gala2008-07-254-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | * CONFIG_BOOKE is selected by CONFIG_44x so we dont need both * Fixed a few comments * Go back to only using DBCR0_IDM to determine if we are using debug resources. Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* | | cpufreq acpi: only call _PPC after cpufreq ACPI init funcs got called alreadyThomas Renninger2008-07-301-0/+6
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ingo Molnar provided a fix to not call _PPC at processor driver initialization time in "[PATCH] ACPI: fix cpufreq regression" (git commit e4233dec749a3519069d9390561b5636a75c7579) But it can still happen that _PPC is called at processor driver initialization time. This patch should make sure that this is not possible anymore. Signed-off-by: Thomas Renninger <trenn@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Len Brown <lenb@kernel.org> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | powerpc: Disable 64K hugetlb support when doing 64K SPU mappingsBenjamin Herrenschmidt2008-07-281-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | The 64K SPU local store mapping feature is incompatible with the 64K huge pages support due to the inability of some parts of the memory management to differenciate between them while they use a different page table format. For now, disable 64K huge pages when CONFIG_SPU_FS_64K_LS, in the long run, this can be fixed by making this feature use the hugetlb page table format. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powermac: Fixup default serial port device for pmac_zilogBenjamin Herrenschmidt2008-07-282-27/+89
| | | | | | | | | | | | | | | | This removes the non-working code in legacy_serial that tried to handle the powermac SCC ports, and instead add a (now working) function to the powermac platform code to find the default serial console if any. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/powermac: Use sane default baudrate for SCC debuggingBenjamin Herrenschmidt2008-07-281-1/+11
| | | | | | | | | | | | | | | | | | | | | | When using the "sccdbg" option to route early kernel messages and xmon to the SCC serial port on PowerMacs, when this wasn't the configured output port of Open Firmware, we initialize the baudrate to 57600bps. This isn't a very good default on some powermacs where both the FW and pmac_zilog will default to 38400. This fixes it to use the same logic as pmac_zilog to pick a default speed. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Show processor cache information in sysfsNathan Lynch2008-07-281-0/+309
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Collect cache information from the OF device tree and display it in the cpu hierarchy in sysfs. This is intended to be compatible at the userspace level with x86's implementation[1], hence some of the funny attribute names. The arrangement of cache info is not immediately intuitive, but (again) it's for compatibility's sake. The cache attributes exposed are: type (Data, Instruction, or Unified) level (1, 2, 3...) size coherency_line_size number_of_sets ways_of_associativity All of these can be derived on platforms that follow the OF PowerPC Processor binding. The code "publishes" only those attributes for which it is able to determine values; attributes for values which cannot be determined are not created at all. [1] arch/x86/kernel/cpu/intel_cacheinfo.c BenH: Turned some printk's into pr_debug, added better NULL checking in a couple of places. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Make core id information available to userspaceNathan Lynch2008-07-281-0/+23
| | | | | | | | | | | | | | | | | | | | Existing Open Firmware practice is to report each processor core as a separate node in the device tree. Report the value of the "reg" OF property corresponding to a logical CPU's device node as the core_id attribute in /sys/devices/system/cpu/cpu*/topology/core_id. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Make core sibling information available to userspaceNathan Lynch2008-07-281-0/+64
| | | | | | | | | | | | | | | | | | | | | | | | Implement the notion of "core siblings" for powerpc. This makes /sys/devices/system/cpu/cpu*/topology/core_siblings present sensible values, indicating online CPUs which share an L2 cache. BenH: Made cpu_to_l2cache() use of_find_node_by_phandle() instead of IBM-specific open coded search Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/vio: More fallout from dma_mapping_error API changeStephen Rothwell2008-07-281-1/+1
| | | | | | | | | | | | | | arch/powerpc/kernel/vio.c:533: error: too few arguments to function 'dma_mapping_error' Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/pseries: Fix CMO sysdev attribute API change falloutStephen Rothwell2008-07-281-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | Noticed due to these wanings: arch/powerpc/platforms/pseries/cmm.c:298: warning: initialization from incompatible pointer type arch/powerpc/platforms/pseries/cmm.c:299: warning: initialization from incompatible pointer type arch/powerpc/platforms/pseries/cmm.c:320: warning: initialization from incompatible pointer type arch/powerpc/platforms/pseries/cmm.c:320: warning: initialization from incompatible pointer type Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Enable tracehook for the architectureRoland McGrath2008-07-281-0/+1
| | | | | | | | | | | | | | The powerpc arch code has all the prerequisites, so set HAVE_ARCH_TRACEHOOK. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Add TIF_NOTIFY_RESUME support for tracehookRoland McGrath2008-07-283-5/+15
| | | | | | | | | | | | | | | | | | | | | | This adds TIF_NOTIFY_RESUME support for powerpc. When set, we call tracehook_notify_resume() on the way to user mode. This overloads do_signal() to do the work, but changes its arguments to it has the TIF_* bits handy in a register and drops the useless first argument that was always zero. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Make syscall tracing use tracehook.h helpersRoland McGrath2008-07-283-27/+34
| | | | | | | | | | | | | | | | | | | | | | | | This changes powerpc syscall tracing to use the new tracehook.h entry points. There is no change, only cleanup. In addition, the assembly changes allow do_syscall_trace_enter() to abort the syscall without losing the information about the original r0 value. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Call tracehook_signal_handler() when setting up signal framesRoland McGrath2008-07-281-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | This makes the powerpc signal handling code call tracehook_signal_handler() after a handler is set up. This means that using PTRACE_SINGLESTEP to enter a signal handler will report to ptrace on the first instruction of the handler, instead of the second. This is consistent with what x86 and other machines do, and what users and debuggers want. BenH: Fixed up the test for the trap value. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Update cpu_sibling_maps dynamicallyNathan Lynch2008-07-283-30/+29Star
| | | | | | | | | | | | | | | | | | | | | | | | | | Rather doing one initialization pass over all the per-cpu cpu_sibling_maps at boot, update the maps at cpu online/offline time. This is a behavior change -- the thread_siblings attribute now reflects only online siblings, whereas it would display offline siblings before. The new behavior matches that of x86, and is arguably more useful. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: register_cpu_online should be __cpuinitNathan Lynch2008-07-281-1/+1
| | | | | | | | | | | | | | | | | | It is called only in cpu online paths. (caught by CONFIG_DEBUG_SECTION_MISMATCH=y) Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: kill useless SMT code in prom_hold_cpusNathan Lynch2008-07-281-36/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | This piece of code is broken for >2 threads, and possibly in some other subtle ways (such as comparing a value obtained from an "ibm,ppc-interrupt-server#s" property to a value obtained from a "reg" property) and doesn't seem to have any useful purpose in the first place other than a dubious warning in case NR_CPUS is too small, which probably isn't the right place to do so. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Fix vio build warningsNathan Lynch2008-07-281-2/+2
| | | | | | | | | | | | | | | | arch/powerpc/kernel/vio.c:1034: warning: function declaration isn’t a prototype arch/powerpc/kernel/vio.c:1035: warning: function declaration isn’t a prototype Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc/booke: Clean up the hardware watchpoint supportKumar Gala2008-07-284-11/+12
| | | | | | | | | | | | | | | | | | | | * CONFIG_BOOKE is selected by CONFIG_44x so we dont need both * Fixed a few comments * Go back to only using DBCR0_IDM to determine if we are using debug resources. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | powerpc: Removed duplicated include in stacktrace.cHuang Weiyi2008-07-281-1/+0Star
| | | | | | | | | | | | | | | | Removed duplicated include file <linux/module.h> in arch/powerpc/kernel/stacktrace.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | KVM: ppc: fix invalidation of large guest pagesHollis Blanchard2008-07-272-3/+4
| | | | | | | | | | | | | | | | | | | | When guest invalidates a large tlb map, there may be more than one corresponding shadow tlb maps that need to be invalidated. Use eaddr and eend to find these shadow tlb maps. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
* | powerpc: use generic show_mem()Johannes Weiner2008-07-261-39/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove arch-specific show_mem() in favor of the generic version. This also removes the following redundant information display: - pages in swapcache, printed by show_swap_cache_info() where show_mem() calls show_free_areas(), which calls show_swap_cache_info(). Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | SL*B: drop kmem cache argument from constructorAlexey Dobriyan2008-07-264-24/+13Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | kexec jumpHuang Ying2008-07-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch provides an enhancement to kexec/kdump. It implements the following features: - Backup/restore memory used by the original kernel before/after kexec. - Save/restore CPU state before/after kexec. The features of this patch can be used as a general method to call program in physical mode (paging turning off). This can be used to call BIOS code under Linux. kexec-tools needs to be patched to support kexec jump. The patches and the precompiled kexec can be download from the following URL: source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2 patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2 binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10 Usage example of calling some physical mode code and return: 1. Compile and install patched kernel with following options selected: CONFIG_X86_32=y CONFIG_KEXEC=y CONFIG_PM=y CONFIG_KEXEC_JUMP=y 2. Build patched kexec-tool or download the pre-built one. 3. Build some physical mode executable named such as "phy_mode" 4. Boot kernel compiled in step 1. 5. Load physical mode executable with /sbin/kexec. The shell command line can be as follow: /sbin/kexec --load-preserve-context --args-none phy_mode 6. Call physical mode executable with following shell command line: /sbin/kexec -e Implementation point: To support jumping without reserving memory. One shadow backup page (source page) is allocated for each page used by kexeced code image (destination page). When do kexec_load, the image of kexeced code is loaded into source pages, and before executing, the destination pages and the source pages are swapped, so the contents of destination pages are backupped. Before jumping to the kexeced code image and after jumping back to the original kernel, the destination pages and the source pages are swapped too. C ABI (calling convention) is used as communication protocol between kernel and called code. A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to indicate that the loaded kernel image is used for jumping back. Now, only the i386 architecture is supported. Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | dma-mapping: add the device argument to dma_mapping_error()FUJITA Tomonori2008-07-263-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | powerpc: Fix boot problem due to AT_BASE_PLATFORM changeNathan Lynch2008-07-261-2/+2
|/ | | | | | | | | | | | Commit 9115d13453dee22473a1e8cacc90a8d64a9c4bc9 ("powerpc: Enable AT_BASE_PLATFORM aux vector") broke boot on 32-bit powerpc systems; we have to use PTRRELOC to initialize powerpc_base_platform this early in boot. Bug reported by Jon Smirl. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* Merge branch 'merge' of ↵Linus Torvalds2008-07-2526-205/+2118
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits) powerpc: Wireup new syscalls Move update_mmu_cache() declaration from tlbflush.h to pgtable.h powerpc/pseries: Remove kmalloc call in handling writes to lparcfg powerpc/pseries: Update arch vector to indicate support for CMO ibmvfc: Add support for collaborative memory overcommit ibmvscsi: driver enablement for CMO ibmveth: enable driver for CMO ibmveth: Automatically enable larger rx buffer pools for larger mtu powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O powerpc/pseries: vio bus support for CMO powerpc/pseries: iommu enablement for CMO powerpc/pseries: Add CMO paging statistics powerpc/pseries: Add collaborative memory manager powerpc/pseries: Utilities to set firmware page state powerpc/pseries: Enable CMO feature during platform setup powerpc/pseries: Split retrieval of processor entitlement data into a helper routine powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg powerpc: Fix compile error with binutils 2.15 ... Fixed up conflict in arch/powerpc/platforms/52xx/Kconfig manually.
| * powerpc/pseries: Remove kmalloc call in handling writes to lparcfgNathan Fontenot2008-07-251-16/+12Star
| | | | | | | | | | | | | | | | | | | | | | There are only 4 valid name=value pairs for writes to /proc/ppc64/lparcfg. Current code allocates a buffer to copy this information in from the user. Since the longest name=value pair will easily fit into a buffer of 64 characters, simply put the buffer on the stack instead of allocating the buffer. Signed-off-by: Nathan Fotenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: Update arch vector to indicate support for CMONathan Fontenot2008-07-251-1/+8
| | | | | | | | | | | | | | | | | | | | Update the architecture vector to indicate that Cooperative Memory Overcommitment is supported if CONFIG_PPC_SMLPAR is set. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: Verify CMO memory entitlement updates with virtual I/ONathan Fontenot2008-07-251-0/+10
| | | | | | | | | | | | | | | | | | Verify memory entitlement updates can be handled by vio. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: vio bus support for CMORobert Jennings2008-07-251-6/+1027
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a large patch but the normal code path is not affected. For non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled pSeries systems this does not affect the normal code path. Devices that do not perform DMA operations do not need modification with this patch. The function get_desired_dma was renamed from get_io_entitlement for clarity. Overview Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions to be run with less RAM than the aggregate needs of the group of partitions. The firmware will balance memory between the partitions and page in/out memory as needed. Based on the number and type of IO adpaters preset each partition is allocated an amount of memory for DMA operations and this allocation will be guaranteed to the partition; this is referred to as the partition's 'entitlement'. Partitions running in a CMO environment can only have virtual IO devices present. The VIO bus layer will manage the IO entitlement for the system. Accounting, at a system and per-device level, is tracked in the VIO bus code and exposed via sysfs. A set of dma_ops functions are added to the bus to allow for this accounting. Bus initialization At initialization, the bus will calculate the minimum needs of the system based on providing each device present with a standard minimum entitlement along with a spare allocation for the bus to handle hotplug events. If the minimum needs can not be met the system boot will be halted. Device changes The significant changes for devices while running under CMO are that the devices must specify how much dedicated IO entitlement they desire and must also handle DMA mapping errors that can occur due to constrained IO memory. The virtual IO drivers are modified to silence errors when DMA mappings fail for CMO and handle these failures gracefully. Each devices will be guaranteed a minimum entitlement that can always be mapped. Devices will specify how much entitlement they desire and the VIO bus will attempt to provide for this. Devices can change their desired entitlement level at any point in time to address particular needs (via vio_cmo_set_dev_desired()), not just at device probe time. VIO bus changes The system will have a particular entitlement level available from which it can provide memory to the devices. The bus defines two pools of memory within this entitlement, the reserved and excess pools. Each device is provided with it's own entitlement no less than a system defined minimum entitlement and no greater than what the device has specified as it's desired entitlement. The entitlement provided to devices comes from the reserve pool. The reserve pool can also contain a spare allocation as large as the system defined minimum entitlement which is used for device hotplug events. Any entitlement not needed to fulfill the needs of a reserve pool is placed in the excess pool. Each device is guaranteed that it can map up to it's entitled level; additional mapping are possible as long as there is unmapped memory in the excess pool. Bus probe As the system starts, each device is given an entitlement equal only to the system defined minimum entitlement. The reserve pool is equal to the sum of these entitlements, plus a spare allocation. The VIO bus also tracks the aggregate desired entitlement of all the devices. If the system desired entitlement is greater than the size of the reserve pool, when devices unmap IO memory it will be reserved and a balance operation will be scheduled for some time in the future. Entitlement balancing The balance function tries to fairly distribute entitlement between the devices in the system with the goal of providing each device with it's desired amount of entitlement. Devices using more than what would be ideal will have their entitled set-point adjusted; this will effectively set a goal for lower IO memory usage as future mappings can fail and deallocations will trigger a balance operation to distribute the newly unmapped memory. A fair distribution of entitlement can take several balance operations to achieve. Entitlement changes and device DLPAR events will alter the state of CMO and will trigger balance operations. Hotplug events The VIO bus allows for changes in system entitlement at run-time via 'vio_cmo_entitlement_update()'. When devices are added the hotplug device event will be preceded by a system entitlement increase and this is reversed when devices are removed. The following changes are made that the VIO bus layer for CMO: * add IO memory accounting per device structure. * add IO memory entitlement query function to driver structure. * during vio bus probe, if CMO is enabled, check that driver has memory entitlement query function defined. Fail if function not defined. * fail to register driver if io entitlement function not defined. * create set of dma_ops at vio level for CMO that will track allocations and return DMA failures once entitlement is reached. Entitlement will limited by overall system entitlement. Devices will have a reserved quantity of memory that is guaranteed, the rest can be used as available. * expose entitlement, current allocation, desired allocation, and the allocation error counter for devices to the user through sysfs * provide mechanism for changing a device's desired entitlement at run time for devices as an exported function and sysfs tunable * track any DMA failures for entitled IO memory for each vio device. * check entitlement against available system entitlement on device add * track entitlement metrics (high water mark, current usage) * provide function to reset high water mark * provide minimum and desired entitlement numbers at a bus level * provide drivers with a minimum guaranteed entitlement * balance available entitlement between devices to satisfy their needs * handle system entitlement changes and device hotplug Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: iommu enablement for CMORobert Jennings2008-07-256-19/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To support Cooperative Memory Overcommitment (CMO), we need to check for failure from some of the tce hcalls. These changes for the pseries platform affect the powerpc architecture; patches for the other affected platforms are included in this patch. pSeries platform IOMMU code changes: * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and return an error. Architecture IOMMU code changes: * Calls to ppc_md.tce_build need to check return values and return DMA_MAPPING_ERROR for transient errors. Architecture changes: * struct machdep_calls for tce_build*_pSeriesLP functions need to change to indicate failure. * all other platforms will need updates to iommu functions to match the new calling semantics; they will return 0 on success. The other platforms default configs have been built, but no further testing was performed. Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: Add CMO paging statisticsBrian King2008-07-251-0/+20
| | | | | | | | | | | | | | | | | | | | | | With the addition of Cooperative Memory Overcommitment (CMO) support for IBM Power Systems, two fields have been added to the VPA to report paging statistics. Add support in lparcfg to report them to userspace. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: Add collaborative memory managerBrian King2008-07-253-0/+492
| | | | | | | | | | | | | | | | | | | | | | | | | | Adds a collaborative memory manager, which acts as a simple balloon driver for System p machines that support cooperative memory overcommitment (CMO). Adds a platform configuration option for CMO called PPC_SMLPAR. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: Utilities to set firmware page stateBrian King2008-07-251-0/+10
| | | | | | | | | | | | | | | | | | | | | | Newer versions of firmware support page states, which are used by the collaborative memory manager (future patch) to "loan" pages to the hypervisor for use by other partitions. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: Enable CMO feature during platform setupRobert Jennings2008-07-251-0/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO flag in powerpc_firmware_features from the rtas ibm,get-system-parameters table prior to calling iommu_init_early_pSeries. With this, any CMO specific functionality can be controlled by checking: firmware_has_feature(FW_FEATURE_CMO) Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/pseries: Split retrieval of processor entitlement data into a helper ↵Robert Jennings2008-07-251-35/+46
| | | | | | | | | | | | | | | | | | | | | | | | routine Split the retrieval of processor entitlement data returned in the H_GET_PPP hcall into its own helper routine. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>