summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/include
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'pm+acpi-3.15-rc1-3' of ↵Linus Torvalds2014-04-111-0/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more ACPI and power management fixes and updates from Rafael Wysocki: "This is PM and ACPI material that has emerged over the last two weeks and one fix for a CPU hotplug regression introduced by the recent CPU hotplug notifiers registration series. Included are intel_idle and turbostat updates from Len Brown (these have been in linux-next for quite some time), a new cpufreq driver for powernv (that might spend some more time in linux-next, but BenH was asking me so nicely to push it for 3.15 that I couldn't resist), some cpufreq fixes and cleanups (including fixes for some silly breakage in a couple of cpufreq drivers introduced during the 3.14 cycle), assorted ACPI cleanups, wakeup framework documentation fixes, a new sysfs attribute for cpuidle and a new command line argument for power domains diagnostics. Specifics: - Fix for a recently introduced CPU hotplug regression in ARM KVM from Ming Lei. - Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32 cpufreq drivers introduced during the 3.14 cycle (-stable material) from Chen Gang and Viresh Kumar. - New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits from Gautham R Shenoy and Srivatsa S Bhat. - Exynos cpufreq driver fix preventing it from being included into multiplatform builds that aren't supported by it from Sachin Kamat. - cpufreq cleanups related to the usage of the driver_data field in struct cpufreq_frequency_table from Viresh Kumar. - cpufreq ppc driver cleanup from Sachin Kamat. - Intel BayTrail support for intel_idle and ACPI idle from Len Brown. - Intel CPU model 54 (Atom N2000 series) support for intel_idle from Jan Kiszka. - intel_idle fix for Intel Ivy Town residency targets from Len Brown. - turbostat updates (Intel Broadwell support and output cleanups) from Len Brown. - New cpuidle sysfs attribute for exporting C-states' target residency information to user space from Daniel Lezcano. - New kernel command line argument to prevent power domains enabled by the bootloader from being turned off even if they are not in use (for diagnostics purposes) from Tushar Behera. - Fixes for wakeup sysfs attributes documentation from Geert Uytterhoeven. - New ACPI video blacklist entry for ThinkPad Helix from Stephen Chandler Paul. - Assorted ACPI cleanups and a Kconfig help update from Jonghwan Choi, Zhihui Zhang, Hanjun Guo" * tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits) ACPI: Update the ACPI spec information in Kconfig arm, kvm: fix double lock on cpu_add_remove_lock cpuidle: sysfs: Export target residency information cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h cpufreq: create another field .flags in cpufreq_frequency_table cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table cpufreq: don't print value of .driver_data from core cpufreq: ia64: don't set .driver_data to index cpufreq: powernv: Select CPUFreq related Kconfig options for powernv cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids cpufreq: powernv: cpufreq driver for powernv platform cpufreq: at32ap: don't declare local variable as static cpufreq: loongson2_cpufreq: don't declare local variable as static cpufreq: unicore32: fix typo issue for 'clk' cpufreq: exynos: Disable on multiplatform build PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes PM / domains: Add pd_ignore_unused to keep power domains enabled ACPI / dock: Drop dock_device_ids[] table ACPI / video: Favor native backlight interface for ThinkPad Helix ACPI / thermal: Fix wrong variable usage in debug statement ...
| * Merge branch 'pm-cpufreq'Rafael J. Wysocki2014-04-081-0/+4
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * pm-cpufreq: cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h cpufreq: create another field .flags in cpufreq_frequency_table cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table cpufreq: don't print value of .driver_data from core cpufreq: ia64: don't set .driver_data to index cpufreq: powernv: Select CPUFreq related Kconfig options for powernv cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids cpufreq: powernv: cpufreq driver for powernv platform cpufreq: at32ap: don't declare local variable as static cpufreq: loongson2_cpufreq: don't declare local variable as static cpufreq: unicore32: fix typo issue for 'clk' cpufreq: exynos: Disable on multiplatform build
| | * cpufreq: powernv: cpufreq driver for powernv platformVaidyanathan Srinivasan2014-04-071-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backend driver to dynamically set voltage and frequency on IBM POWER non-virtualized platforms. Power management SPRs are used to set the required PState. This driver works in conjunction with cpufreq governors like 'ondemand' to provide a demand based frequency and voltage setting on IBM POWER non-virtualized platforms. PState table is obtained from OPAL v3 firmware through device tree. powernv_cpufreq back-end driver would parse the relevant device-tree nodes and initialise the cpufreq subsystem on powernv platform. The code was originally written by svaidy@linux.vnet.ibm.com. Over time it was modified to accomodate bug-fixes as well as updates to the the cpu-freq core. Relevant portions of the change logs corresponding to those modifications are noted below: * The policy->cpus needs to be populated in a hotplug-invariant manner instead of using cpu_sibling_mask() which varies with cpu-hotplug. This is because the cpufreq core code copies this content into policy->related_cpus mask which should not vary on cpu-hotplug. [Authored by srivatsa.bhat@linux.vnet.ibm.com] * Create a helper routine that can return the cpu-frequency for the corresponding pstate_id. Also, cache the values of the pstate_max, pstate_min and pstate_nominal and nr_pstates in a static structure so that they can be reused in the future to perform any validations. [Authored by ego@linux.vnet.ibm.com] * Create a driver attribute named cpuinfo_nominal_freq which creates a sysfs read-only file named cpuinfo_nominal_freq. Export the frequency corresponding to the nominal_pstate through this interface. Nominal frequency is the highest non-turbo frequency for the platform. This is generally used for setting governor policies from user space for optimal energy efficiency. [Authored by ego@linux.vnet.ibm.com] * Implement a powernv_cpufreq_get(unsigned int cpu) method which will return the current operating frequency. Export this via the sysfs interface cpuinfo_cur_freq by setting powernv_cpufreq_driver.get to powernv_cpufreq_get(). [Authored by ego@linux.vnet.ibm.com] [Change log updated by ego@linux.vnet.ibm.com] Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | | Merge branch 'merge' of ↵Linus Torvalds2014-04-093-38/+109
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull more powerpc updates from Ben Herrenschmidt: "Here are a few more powerpc things for you. So you'll find here the conversion of the two new firmware sysfs interfaces to the new API for self-removing files that Greg and Tejun introduced, so they can finally remove the old one. I'm also reverting the hwmon driver for powernv. I shouldn't have merged it, I got a bit carried away here. I hadn't realized it was never CCed to the relevant maintainer(s) and list(s), and happens to have some issues so I'm taking it out and it will come back via the proper channels. The rest is a bunch of LE fixes (argh, some of the new stuff was broken on LE, I really need to start testing LE myself !) and various random fixes here and there. Finally one bit that's not strictly a fix, which is the HVC OPAL change to "kick" the HVC thread when the firmware tells us there is new incoming data. I don't feel like waiting for this one, it's simple enough, and it makes a big difference in console responsiveness which is good for my nerves" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (26 commits) powerpc/powernv Adapt opal-elog and opal-dump to new sysfs_remove_file_self Revert "powerpc/powernv: hwmon driver for power values, fan rpm and temperature" power, sched: stop updating inside arch_update_cpu_topology() when nothing to be update powerpc/le: Avoid creatng R_PPC64_TOCSAVE relocations for modules. arch/powerpc: Use RCU_INIT_POINTER(x, NULL) in platforms/cell/spu_syscalls.c powerpc/opal: Add missing include powerpc: Convert last uses of __FUNCTION__ to __func__ powerpc: Add lq/stq emulation powerpc/powernv: Add invalid OPAL call powerpc/powernv: Add OPAL message log interface powerpc/book3s: Fix mc_recoverable_range buffer overrun issue. powerpc: Remove dead code in sycall entry powerpc: Use of_node_init() for the fakenode in msi_bitmap.c powerpc/mm: NUMA pte should be handled via slow path in get_user_pages_fast() powerpc/powernv: Fix endian issues with sensor code powerpc/powernv: Fix endian issues with OPAL async code tty/hvc_opal: Kick the HVC thread on OPAL console events powerpc/powernv: Add opal_notifier_unregister() and export to modules powerpc/ppc64: Do not turn AIL (reloc-on interrupts) too early powerpc/ppc64: Gracefully handle early interrupts ...
| * | | powerpc/opal: Add missing includeMichael Neuling2014-04-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | next-20140324 currently fails compiling celleb_defconfig with: arch/powerpc/include/asm/opal.h:894:42: error: 'struct notifier_block' declared inside parameter list [-Werror] arch/powerpc/include/asm/opal.h:894:42: error: its scope is only this definition or declaration, which is probably not what you want [-Werror] arch/powerpc/include/asm/opal.h:896:14: error: 'struct notifier_block' declared inside parameter list [-Werror] This is due to a missing include which is added here. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc: Add lq/stq emulationAnton Blanchard2014-04-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent CPUs support quad word load and store instructions. Add support to the alignment handler for them. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/powernv: Add invalid OPAL callJoel Stanley2014-04-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This call will not be understood by OPAL, and cause it to add an error to it's log. Among other things, this is useful for testing the behaviour of the log as it fills up. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/powernv: Add OPAL message log interfaceJoel Stanley2014-04-091-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OPAL provides an in-memory circular buffer containing a message log populated with various runtime messages produced by the firmware. Provide a sysfs interface /sys/firmware/opal/msglog for userspace to view the messages. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/powernv: Fix endian issues with sensor codeAnton Blanchard2014-04-091-2/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | One OPAL call and one device tree property needed byte swapping. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/powernv: Fix endian issues with OPAL async codeAnton Blanchard2014-04-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OPAL defines opal_msg as a big endian struct so we have to byte swap it on little endian builds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/powernv: Add opal_notifier_unregister() and export to modulesBenjamin Herrenschmidt2014-04-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | opal_notifier_register() is missing a pending "unregister" variant and should be exposed to modules. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | powerpc/le: Enable RTAS events supportGreg Kurz2014-04-071-33/+94
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current kernel code assumes big endian and parses RTAS events all wrong. The most visible effect is that we cannot honor EPOW events, meaning, for example, we cannot shut down a guest properly from the hypervisor. This new patch is largely inspired by Nathan's work: we get rid of all the bit fields in the RTAS event structures (even the unused ones, for consistency). We also introduce endian safe accessors for the fields used by the kernel (trivial rtas_error_type() accessor added for consistency). Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* / / include/linux/crash_dump.h: add vmcore_cleanup() prototypeRashika Kheria2014-04-081-1/+0Star
|/ / | | | | | | | | | | | | | | | | | | | | | | | | Eliminate the following warning in proc/vmcore.c: fs/proc/vmcore.c:1088:6: warning: no previous prototype for `vmcore_cleanup' [-Wmissing-prototypes] [akpm@linux-foundation.org: clean up powerpc, remove unneeded EXPORT_SYMBOL] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'random_for_linus' of ↵Linus Torvalds2014-04-051-0/+18
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull /dev/random changes from Ted Ts'o: "A number of cleanups plus support for the RDSEED instruction, which will be showing up in Intel Broadwell CPU's" * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: Add arch_has_random[_seed]() random: If we have arch_get_random_seed*(), try it before blocking random: Use arch_get_random_seed*() at init time and once a second x86, random: Enable the RDSEED instruction random: use the architectural HWRNG for the SHA's IV in extract_buf() random: clarify bits/bytes in wakeup thresholds random: entropy_bytes is actually bits random: simplify accounting code random: tighten bound on random_read_wakeup_thresh random: forget lock in lockless accounting random: simplify accounting logic random: fix comment on "account" random: simplify loop in random_read random: fix description of get_random_bytes random: fix comment on proc_do_uuid random: fix typos / spelling errors in comments
| * | random: Add arch_has_random[_seed]()H. Peter Anvin2014-03-201-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add predicate functions for having arch_get_random[_seed]*(). The only current use is to avoid the loop in arch_random_refill() when arch_get_random_seed_long() is unavailable. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * | x86, random: Enable the RDSEED instructionH. Peter Anvin2014-03-201-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upcoming Intel silicon adds a new RDSEED instruction, which is similar to RDRAND but provides a stronger guarantee: unlike RDRAND, RDSEED will always reseed the PRNG from the true random number source between each read. Thus, the output of RDSEED is guaranteed to be 100% entropic, unlike RDRAND which is only architecturally guaranteed to be 1/512 entropic (although in practice is much more.) The RDSEED instruction takes the same time to execute as RDRAND, but RDSEED unlike RDRAND can legitimately return failure (CF=0) due to entropy exhaustion if too many threads on too many cores are hammering the RDSEED instruction at the same time. Therefore, we have to be more conservative and only use it in places where we can tolerate failures. This patch introduces the primitives arch_get_random_seed_{int,long}() but does not use it yet. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | Merge tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2014-04-026-1/+25
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm updates from Paolo Bonzini: "PPC and ARM do not have much going on this time. Most of the cool stuff, instead, is in s390 and (after a few releases) x86. ARM has some caching fixes and PPC has transactional memory support in guests. MIPS has some fixes, with more probably coming in 3.16 as QEMU will soon get support for MIPS KVM. For x86 there are optimizations for debug registers, which trigger on some Windows games, and other important fixes for Windows guests. We now expose to the guest Broadwell instruction set extensions and also Intel MPX. There's also a fix/workaround for OS X guests, nested virtualization features (preemption timer), and a couple kvmclock refinements. For s390, the main news is asynchronous page faults, together with improvements to IRQs (floating irqs and adapter irqs) that speed up virtio devices" * tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits) KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8 KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode KVM: PPC: Book3S HV: Return ENODEV error rather than EIO KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state KVM: PPC: Book3S HV: Add transactional memory support KVM: Specify byte order for KVM_EXIT_MMIO KVM: vmx: fix MPX detection KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write KVM: s390: clear local interrupts at cpu initial reset KVM: s390: Fix possible memory leak in SIGP functions KVM: s390: fix calculation of idle_mask array size KVM: s390: randomize sca address KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP KVM: Bump KVM_MAX_IRQ_ROUTES for s390 KVM: s390: irq routing for adapter interrupts. KVM: s390: adapter interrupt sources ...
| * | Merge branch 'kvm-ppchv-next' of ↵Paolo Bonzini2014-03-296-1/+25
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-next
| | * | KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8Paul Mackerras2014-03-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we save the host PMU configuration, counter values, etc., when entering a guest, and restore it on return from the guest. (We have to do this because the guest has control of the PMU while it is executing.) However, we missed saving/restoring the SIAR and SDAR registers, as well as the registers which are new on POWER8, namely SIER and MMCR2. This adds code to save the values of these registers when entering the guest and restore them on exit. This also works around the bug in POWER8 where setting PMAE with a counter already negative doesn't generate an interrupt. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
| | * | KVM: PPC: Book3S HV: Don't use kvm_memslots() in real modePaul Mackerras2014-03-291-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With HV KVM, some high-frequency hypercalls such as H_ENTER are handled in real mode, and need to access the memslots array for the guest. Accessing the memslots array is safe, because we hold the SRCU read lock for the whole time that a guest vcpu is running. However, the checks that kvm_memslots() does when lockdep is enabled are potentially unsafe in real mode, when only the linear mapping is available. Furthermore, kvm_memslots() can be called from a secondary CPU thread, which is an offline CPU from the point of view of the host kernel, and is not running the task which holds the SRCU read lock. To avoid false positives in the checks in kvm_memslots(), and to avoid possible side effects from doing the checks in real mode, this replaces kvm_memslots() with kvm_memslots_raw() in all the places that execute in real mode. kvm_memslots_raw() is a new function that is like kvm_memslots() but uses rcu_dereference_raw_notrace() instead of kvm_dereference_check(). Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
| | * | KVM: PPC: Book3S HV: Add transactional memory supportMichael Neuling2014-03-292-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds saving of the transactional memory (TM) checkpointed state on guest entry and exit. We only do this if we see that the guest has an active transaction. It also adds emulation of the TM state changes when delivering IRQs into the guest. According to the architecture, if we are transactional when an IRQ occurs, the TM state is changed to suspended, otherwise it's left unchanged. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Scott Wood <scottwood@freescale.com>
| | * | KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCELaurent Dufour2014-03-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces the H_GET_TCE hypervisor call, which is basically the reverse of H_PUT_TCE, as defined in the Power Architecture Platform Requirements (PAPR). The hcall H_GET_TCE is required by the kdump kernel, which uses it to retrieve TCEs set up by the previous (panicked) kernel. Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
| | * | KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd writeGreg Kurz2014-03-261-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the guest does an MMIO write which is handled successfully by an ioeventfd, ioeventfd_write() returns 0 (success) and kvmppc_handle_store() returns EMULATE_DONE. Then kvmppc_emulate_mmio() converts EMULATE_DONE to RESUME_GUEST_NV and this causes an exit from the loop in kvmppc_vcpu_run_hv(), causing an exit back to userspace with a bogus exit reason code, typically causing userspace (e.g. qemu) to crash with a message about an unknown exit code. This adds handling of RESUME_GUEST_NV in kvmppc_vcpu_run_hv() in order to fix that. For generality, we define a helper to check for either of the return-to-guest codes we use, RESUME_GUEST and RESUME_GUEST_NV, to make it easy to check for either and provide one place to update if any other return-to-guest code gets defined in future. Since it only affects Book3S HV for now, the helper is added to the kvm_book3s.h header file. We use the helper in two places in kvmppc_run_core() as well for future-proofing, though we don't see RESUME_GUEST_NV in either place at present. [paulus@samba.org - combined 4 patches into one, rewrote description] Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | | | Merge branch 'powernv-cpuidle' of ↵Linus Torvalds2014-04-024-1/+5
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt: "This is the branch I mentioned in my other pull request which contains our improved cpuidle support for the "powernv" platform (non-virtualized). It adds support for the "fast sleep" feature of the processor which provides higher power savings than our usual "nap" mode but at the cost of losing the timers while asleep, and thus exploits the new timer broadcast framework to work around that limitation. It's based on a tip timer tree that you seem to have already merged" * 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: cpuidle/powernv: Parse device tree to setup idle states cpuidle/powernv: Add "Fast-Sleep" CPU idle state powerpc/powernv: Add OPAL call to resync timebase on wakeup powerpc/powernv: Add context management for Fast Sleep powerpc: Split timer_interrupt() into timer handling and interrupt handling routines powerpc: Implement tick broadcast IPI as a fixed IPI message powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message
| * | | | powerpc/powernv: Add OPAL call to resync timebase on wakeupVaidyanathan Srinivasan2014-03-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During "Fast-sleep" and deeper power savings state, decrementer and timebase could be stopped making it out of sync with rest of the cores in the system. Add a firmware call to request platform to resync timebase using low level platform methods. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/powernv: Add context management for Fast SleepVaidyanathan Srinivasan2014-03-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before adding Fast-Sleep into the cpuidle framework, some low level support needs to be added to enable it. This includes saving and restoring of certain registers at entry and exit time of this state respectively just like we do in the NAP idle state. Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> [Changelog modified by Preeti U. Murthy <preeti@linux.vnet.ibm.com>] Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc: Implement tick broadcast IPI as a fixed IPI messageSrivatsa S. Bhat2014-03-052-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For scalability and performance reasons, we want the tick broadcast IPIs to be handled as efficiently as possible. Fixed IPI messages are one of the most efficient mechanisms available - they are faster than the smp_call_function mechanism because the IPI handlers are fixed and hence they don't involve costly operations such as adding IPI handlers to the target CPU's function queue, acquiring locks for synchronization etc. Luckily we have an unused IPI message slot, so use that to implement tick broadcast IPIs efficiently. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> [Functions renamed to tick_broadcast* and Changelog modified by Preeti U. Murthy<preeti@linux.vnet.ibm.com>] Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Acked-by: Geoff Levand <geoff@infradead.org> [For the PS3 part] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI messageSrivatsa S. Bhat2014-03-051-1/+1
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IPI handlers for both PPC_MSG_CALL_FUNC and PPC_MSG_CALL_FUNC_SINGLE map to a common implementation - generic_smp_call_function_single_interrupt(). So, we can consolidate them and save one of the IPI message slots, (which are precious on powerpc, since only 4 of those slots are available). So, implement the functionality of PPC_MSG_CALL_FUNC_SINGLE using PPC_MSG_CALL_FUNC itself and release its IPI message slot, so that it can be used for something else in the future, if desired. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com> Acked-by: Geoff Levand <geoff@infradead.org> [For the PS3 part] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* | | | Merge branch 'next' of ↵Linus Torvalds2014-04-0214-37/+126
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull main powerpc updates from Ben Herrenschmidt: "This time around, the powerpc merges are going to be a little bit more complicated than usual. This is the main pull request with most of the work for this merge window. I will describe it a bit more further down. There is some additional cpuidle driver work, however I haven't included it in this tree as it depends on some work in tip/timer-core which Thomas accidentally forgot to put in a topic branch. Since I didn't want to carry all of that tip timer stuff in powerpc -next, I setup a separate branch on top of Thomas tree with just that cpuidle driver in it, and Stephen has been carrying that in next separately for a while now. I'll send a separate pull request for it. Additionally, two new pieces in this tree add users for a sysfs API that Tejun and Greg have been deprecating in drivers-core-next. Thankfully Greg reverted the patch that removes the old API so this merge can happen cleanly, but once merged, I will send a patch adjusting our new code to the new API so that Greg can send you the removal patch. Now as for the content of this branch, we have a lot of perf work for power8 new counters including support for our new "nest" counters (also called 24x7) under pHyp (not natively yet). We have new functionality when running under the OPAL firmware (non-virtualized or KVM host), such as access to the firmware error logs and service processor dumps, system parameters and sensors, along with a hwmon driver for the latter. There's also a bunch of bug fixes accross the board, some LE fixes, and a nice set of selftests for validating our various types of copy loops. On the Freescale side, we see mostly new chip/board revisions, some clock updates, better support for machine checks and debug exceptions, etc..." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (70 commits) powerpc/book3s: Fix CFAR clobbering issue in machine check handler. powerpc/compat: 32-bit little endian machine name is ppcle, not ppc powerpc/le: Big endian arguments for ppc_rtas() powerpc: Use default set of netfilter modules (CONFIG_NETFILTER_ADVANCED=n) powerpc/defconfigs: Enable THP in pseries defconfig powerpc/mm: Make sure a local_irq_disable prevent a parallel THP split powerpc: Rate-limit users spamming kernel log buffer powerpc/perf: Fix handling of L3 events with bank == 1 powerpc/perf/hv_{gpci, 24x7}: Add documentation of device attributes powerpc/perf: Add kconfig option for hypervisor provided counters powerpc/perf: Add support for the hv 24x7 interface powerpc/perf: Add support for the hv gpci (get performance counter info) interface powerpc/perf: Add macros for defining event fields & formats powerpc/perf: Add a shared interface to get gpci version and capabilities powerpc/perf: Add 24x7 interface headers powerpc/perf: Add hv_gpci interface header powerpc: Add hvcalls for 24x7 and gpci (Get Performance Counter Info) sysfs: create bin_attributes under the requested group powerpc/perf: Enable BHRB access for EBB events powerpc/perf: Add BHRB constraint and IFM MMCRA handling for EBB ...
| * \ \ \ Merge remote-tracking branch 'scott/next' into nextBenjamin Herrenschmidt2014-03-245-33/+32Star
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Freescale updates from Scott. Mostly support for critical and machine check exceptions on 64-bit BookE, some new PCI suspend/resume work and misc bits.
| | * | | | powerpc/booke64: Critical and machine check exception supportScott Wood2014-03-201-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add special state saving for critical and machine check exceptions. Most of this code could be used to handle debug exceptions taken from kernel space, but actually doing so is outside the scope of this patch. The various critical and machine check exceptions now point to their real handlers, rather than hanging the kernel. Signed-off-by: Scott Wood <scottwood@freescale.com>
| | * | | | powerpc/booke64: Use SPRG_TLB_EXFRAME on bolted handlersScott Wood2014-03-202-13/+5Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While bolted handlers (including e6500) do not need to deal with a TLB miss recursively causing another TLB miss, nested TLB misses can still happen with crit/mc/debug exceptions -- so we still need to honor SPRG_TLB_EXFRAME. We don't need to spend time modifying it in the TLB miss fastpath, though -- the special level exception will handle that. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Mihai Caraman <mihai.caraman@freescale.com> Cc: kvm-ppc@vger.kernel.org
| | * | | | powerpc/booke64: Use SPRG7 for VDSOScott Wood2014-03-204-15/+14Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously SPRG3 was marked for use by both VDSO and critical interrupts (though critical interrupts were not fully implemented). In commit 8b64a9dfb091f1eca8b7e58da82f1e7d1d5fe0ad ("powerpc/booke64: Use SPRG0/3 scratch for bolted TLB miss & crit int"), Mihai Caraman made an attempt to resolve this conflict by restoring the VDSO value early in the critical interrupt, but this has some issues: - It's incompatible with EXCEPTION_COMMON which restores r13 from the by-then-overwritten scratch (this cost me some debugging time). - It forces critical exceptions to be a special case handled differently from even machine check and debug level exceptions. - It didn't occur to me that it was possible to make this work at all (by doing a final "ld r13, PACA_EXCRIT+EX_R13(r13)") until after I made (most of) this patch. :-) It might be worth investigating using a load rather than SPRG on return from all exceptions (except TLB misses where the scratch never leaves the SPRG) -- it could save a few cycles. Until then, let's stick with SPRG for all exceptions. Since we cannot use SPRG4-7 for scratch without corrupting the state of a KVM guest, move VDSO to SPRG7 on book3e. Since neither SPRG4-7 nor critical interrupts exist on book3s, SPRG3 is still used for VDSO there. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Mihai Caraman <mihai.caraman@freescale.com> Cc: Anton Blanchard <anton@samba.org> Cc: Paul Mackerras <paulus@samba.org> Cc: kvm-ppc@vger.kernel.org
| | * | | | powerpc/e6500: Make TLB lock recursiveScott Wood2014-03-201-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once special level interrupts are supported, we may take nested TLB misses -- so allow the same thread to acquire the lock recursively. The lock will not be effective against the nested TLB miss handler trying to write the same entry as the interrupted TLB miss handler, but that's also a problem on non-threaded CPUs that lack TLB write conditional. This will be addressed in the patch that enables crit/mc support by invalidating the TLB on return from level exceptions. Signed-off-by: Scott Wood <scottwood@freescale.com>
| | * | | | powerpc/fsl: add PVR definition for E500MC and E5500Wang Dongsheng2014-03-201-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>
| * | | | | powerpc/book3s: Fix CFAR clobbering issue in machine check handler.Mahesh Salgaonkar2014-03-241-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While checking powersaving mode in machine check handler at 0x200, we clobber CFAR register. Fix it by saving and restoring it during beq/bgt. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc/compat: 32-bit little endian machine name is ppcle, not ppcAnton Blanchard2014-03-241-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed this when testing setarch. No, we don't magically support a big endian userspace on a little endian kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc: Add hvcalls for 24x7 and gpci (Get Performance Counter Info)Cody P Schafer2014-03-231-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc/perf: Enable BHRB access for EBB eventsMichael Ellerman2014-03-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous commit added constraint and register handling to allow processes using EBB (Event Based Branches) to request access to the BHRB (Branch History Rolling Buffer). With that in place we can allow processes using EBB to access the BHRB. This is achieved by setting BHRBA in MMCR0 when we enable EBB access. We must also clear BHRBA when we are disabling. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc/perf: Add lost exception workaroundMichael Ellerman2014-03-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some power8 revisions have a hardware bug where we can lose a PMU exception, this commit adds a workaround to detect the bad condition and rectify the situation. See the comment in the commit for a full description. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc: Add a cpu feature CPU_FTR_PMAO_BUGMichael Ellerman2014-03-231-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some power8 revisions have a hardware bug where we can lose a Performance Monitor (PMU) exception under certain circumstances. We will be adding a workaround for this case, see the next commit for details. The observed behaviour is that writing PMAO doesn't cause an exception as we would expect, hence the name of the feature. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc/perf: Define perf_event_print_debug() to print PMU register valuesAnshuman Khandual2014-03-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the sysrq ShowRegs command does not print any PMU registers as we have an empty definition for perf_event_print_debug(). This patch defines perf_event_print_debug() to print various PMU registers. Example output: CPU: 0 PMU registers, ppmu = POWER7 n_counters = 6 PMC1: 00000000 PMC2: 00000000 PMC3: 00000000 PMC4: 00000000 PMC5: 00000000 PMC6: 00000000 PMC7: deadbeef PMC8: deadbeef MMCR0: 0000000080000000 MMCR1: 0000000000000000 MMCRA: 0f00000001000000 SIAR: 0000000000000000 SDAR: 0000000000000000 SIER: 0000000000000000 Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> [mpe: Fix 32 bit build and rework formatting for compactness] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc/powernv: Enable fetching of platform sensor dataNeelesh Gupta2014-03-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables fetching of various platform sensor data through OPAL and expects a sensor handle from the driver to pass to OPAL. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc/powernv: Enable reading and updating of system parametersNeelesh Gupta2014-03-231-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables reading and updating of system parameters through OPAL call. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | | powerpc/powernv: Infrastructure to support OPAL async completionNeelesh Gupta2014-03-231-1/+11
| |/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for notifying the clients of their request completion. Clients request for the token before making OPAL call and then wait for the response. This patch uses messaging infrastructure to pull the data to linux by registering itself for the message type OPAL_MSG_ASYNC_COMP. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/powernv Platform dump interfaceStewart Smith2014-03-071-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables support for userspace to fetch and initiate FSP and Platform dumps from the service processor (via firmware) through sysfs. Based on original patch from Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Flow: - We register for OPAL notification events. - OPAL sends new dump available notification. - We make information on dump available via sysfs - Userspace requests dump contents - We retrieve the dump via OPAL interface - User copies the dump data - userspace sends ack for dump - We send ACK to OPAL. sysfs files: - We add the /sys/firmware/opal/dump directory - echoing 1 (well, anything, but in future we may support different dump types) to /sys/firmware/opal/dump/initiate_dump will initiate a dump. - Each dump that we've been notified of gets a directory in /sys/firmware/opal/dump/ with a name of the dump type and ID (in hex, as this is what's used elsewhere to identify the dump). - Each dump has files: id, type, dump and acknowledge dump is binary and is the dump itself. echoing 'ack' to acknowledge (currently any string will do) will acknowledge the dump and it will soon after disappear from sysfs. OPAL APIs: - opal_dump_init() - opal_dump_info() - opal_dump_read() - opal_dump_ack() - opal_dump_resend_notification() Currently we are only ever notified for one dump at a time (until the user explicitly acks the current dump, then we get a notification of the next dump), but this kernel code should "just work" when OPAL starts notifying us of all the dumps present. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/powernv: Read OPAL error log and export it through sysfsStewart Smith2014-03-071-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on a patch by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> This patch adds support to read error logs from OPAL and export them to userspace through a sysfs interface. We export each log entry as a directory in /sys/firmware/opal/elog/ Currently, OPAL will buffer up to 128 error log records, we don't need to have any knowledge of this limit on the Linux side as that is actually largely transparent to us. Each error log entry has the following files: id, type, acknowledge, raw. Currently we just export the raw binary error log in the 'raw' attribute. In a future patch, we may parse more of the error log to make it a bit easier for userspace (e.g. to be able to display a brief summary in petitboot without having to have a full parser). If we have >128 logs from OPAL, we'll only be notified of 128 until userspace starts acknowledging them. This limitation may be lifted in the future and with this patch, that should "just work" from the linux side. A userspace daemon should: - wait for error log entries using normal mechanisms (we announce creation) - read error log entry - save error log entry safely to disk - acknowledge the error log entry - rinse, repeat. On the Linux side, we read the error log when we're notified of it. This possibly isn't ideal as it would be better to only read them on-demand. However, this doesn't really work with current OPAL interface, so we read the error log immediately when notified at the moment. I've tested this pretty extensively and am rather confident that the linux side of things works rather well. There is currently an issue with the service processor side of things for >128 error logs though. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/pseries: Update dynamic cache nodes for suspend/resume operationHaren Myneni2014-03-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pHyp can change cache nodes for suspend/resume operation. Currently the device tree is updated by drmgr in userspace after all non boot CPUs are enabled. Hence, we do not modify the cache list based on the latest cache nodes. Also we do not remove cache entries for the primary CPU. This patch removes the cache list for the boot CPU, updates the device tree before enabling nonboot CPUs and adds cache list for the boot cpu. This patch also has the side effect that older versions of drmgr will perform a second device tree update from userspace. While this is a redundant waste of a couple cycles it is harmless since firmware returns the same data for the subsequent update-nodes/properties rtas calls. Signed-off-by: Haren Myneni <hbabu@us.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/pseries: Use remove_memory() to remove memoryNathan Fontenot2014-03-071-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The memory remove code for powerpc/pseries should call remove_memory() so that we are holding the hotplug_memory lock during memory remove operations. This patch updates the memory node remove handler to call remove_memory() and adds a ppc_md.remove_memory() entry to handle pseries specific work that is called from arch_remove_memory(). During memory remove in pseries_remove_memblock() we have to stay with removing memory one section at a time. This is needed because of how memory resources are handled. During memory add for pseries (via the probe file in sysfs) we add memory one section at a time which gives us a memory resource for each section. Future patches will aim to address this so will not have to remove memory one section at a time. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/book3s: Recover from MC in sapphire on SCOM read via MMIO.Mahesh Salgaonkar2014-03-073-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Detect and recover from machine check when inside opal on a special scom load instructions. On specific SCOM read via MMIO we may get a machine check exception with SRR0 pointing inside opal. To recover from MC in this scenario, get a recovery instruction address and return to it from MC. OPAL will export the machine check recoverable ranges through device tree node mcheck-recoverable-ranges under ibm,opal: # hexdump /proc/device-tree/ibm,opal/mcheck-recoverable-ranges 0000000 0000 0000 3000 2804 0000 000c 0000 0000 0000010 3000 2814 0000 0000 3000 27f0 0000 000c 0000020 0000 0000 3000 2814 xxxx xxxx xxxx xxxx 0000030 llll llll yyyy yyyy yyyy yyyy ... ... # where: xxxx xxxx xxxx xxxx = Starting instruction address llll llll = Length of the address range. yyyy yyyy yyyy yyyy = recovery address Each recoverable address range entry is (start address, len, recovery address), 2 cells each for start and recovery address, 1 cell for len, totalling 5 cells per entry. During kernel boot time, build up the recovery table with the list of recovery ranges from device-tree node which will be used during machine check exception to recover from MMIO SCOM UE. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>