summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* pnfs-obj: Fix the comp_index != 0 caseBoaz Harrosh2011-08-042-9/+10
| | | | | | | | | | | | | | There were bugs in the case of partial layout where olo_comp_index is not zero. This used to work and was tested but one of the later cleanup SQUASHMEs broke it and was not tested since. Also add a dprint that specify those received layout parameters. Everything else was already printed. [Needed in v3.0] CC: Stable Tree <stable@kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* pnfs-obj: Bug when we are running out of bioBoaz Harrosh2011-08-041-7/+5Star
| | | | | | | | | | | | | | | | | | | | | When we have a situation that the number of pages we want to encode is bigger then the size of the bio. (Which can currently happen only when all IO is going to a single device .e.g group_width==1) then the IO is submitted short and we report back only the amount of bytes we actually wrote/read and all is fine. BUT ... There was a bug that the current length counter was advanced before the fail to add the extra page, and we come to a situation that the CDB length was one-page longer then the actual bio size, which is of course rejected by the osd-target. While here also fix the bio size calculation, in the case that we received more then one group of devices. CC: Stable Tree <stable@kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* nfs: add missing prefetch.h includeHeiko Carstens2011-08-041-0/+1
| | | | | | | | | | | | | | Fix this compile error on s390: CC [M] fs/nfs/blocklayout/blocklayout.o fs/nfs/blocklayout/blocklayout.c: In function 'bl_end_io_read': fs/nfs/blocklayout/blocklayout.c:201:4: error: implicit declaration of function 'prefetchw' Introduced with 9549ec01 "pnfsblock: bl_read_pagelist". Cc: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Boot up with usermodehelper disabledLinus Torvalds2011-08-042-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The core device layer sends tons of uevent notifications for each device it finds, and if the kernel has been built with a non-empty CONFIG_UEVENT_HELPER_PATH that will make us try to execute the usermode helper binary for all these events very early in the boot. Not only won't the root filesystem even be mounted at that point, we literally won't have necessarily even initialized all the process handling data structures at that point, which causes no end of silly problems even when the usermode helper doesn't actually succeed in executing. So just use our existing infrastructure to disable the usermodehelpers to make the kernel start out with them disabled. We enable them when we've at least initialized stuff a bit. Problems related to an uninitialized init_ipc_ns.ids[IPC_SHM_IDS].rw_mutex reported by various people. Reported-by: Manuel Lauss <manuel.lauss@googlemail.com> Reported-by: Richard Weinberger <richard@nod.at> Reported-by: Marc Zyngier <maz@misterjones.org> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: don't include xen/xen.h in <asm/io.h> unless XEN is enabledLinus Torvalds2011-08-041-2/+1Star
| | | | | | | | | | | | | | | | Dmitry Kasatkin reports: "kernel-devel package with kernel headers have no <include/xen> directory if XEN is disabled. Modules which inclide asm/io.h won't compile. XEN related content is behind the CONFIG_XEN flag in the io.h. And <xen/xen.h> should be also behind CONFIG_XEN flag." So move the include of <xen/xen.h> down into the section that is conditional on CONFIG_XEN. Reported-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2011-08-049-7/+36
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ad7879 - fix deficient device disable Input: gpio_keys - fix two typos in devicetree documentation Input: mma8450 - add device tree probe support Input: gpio_keys - return proper error code if memory allocation fails Input: lm8323 - add missing device_remove_file for dev_attr_time Input: tegra-kbc - fix computation of polling time Input: kxtj9 - explicitly include module.h Input: psmouse - hgpk.c needs module.h
| * Input: ad7879 - fix deficient device disableMichael Hennerich2011-08-031-1/+3
| | | | | | | | | | | | | | | | | | | | Input close or device disable should not interact with the exported gpiolib functionality. However that's the case. __ad7879_disable() clears the entire AD7879_REG_CTRL2, while it should just power down the ADC and its reference. Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
| * Input: gpio_keys - fix two typos in devicetree documentationTobias Klauser2011-08-031-1/+1
| | | | | | | | | | Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
| * Input: mma8450 - add device tree probe supportShawn Guo2011-08-012-0/+19
| | | | | | | | | | | | | | | | | | It adds device tree probe support for mma8450 driver. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Acked-by: Eric Miao <eric.miao@linaro.org> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
| * Input: gpio_keys - return proper error code if memory allocation failsTobias Klauser2011-07-301-1/+1
| | | | | | | | | | | | | | Return -ENOMEM if kzalloc fails in gpio_keys_get_devtree_pdata(). Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
| * Input: lm8323 - add missing device_remove_file for dev_attr_timeAxel Lin2011-07-301-2/+7
| | | | | | | | | | | | | | | | | | Add missing device_remove_file() for dev_attr_time in lm8323_remove(). Also calling device_remove_file() in lm8323_probe() error path to remove sysfs attribute file. Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
| * Input: tegra-kbc - fix computation of polling timeRakesh Iyer2011-07-301-2/+3
| | | | | | | | | | | | | | | | | | | | Fix a constant definition and computation of polling time. [dtor@mail.ru: switched to using DIV_ROUND_UP as was suggested by Thierry Reding <thierry.reding@avionic-design.de>] Signed-off-by: Rakesh Iyer <riyer@nvidia.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
| * Input: kxtj9 - explicitly include module.hStephen Rothwell2011-07-301-0/+1
| | | | | | | | | | | | | | | | We need to explicitly include module.h since some of its facilities are used. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
| * Input: psmouse - hgpk.c needs module.hRandy Dunlap2011-07-301-0/+1
| | | | | | | | | | | | | | | | hgpk.c uses interfaces from linux/module.h, so it should include that file. This fixes build errors. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* | Merge branch 'idle-release' of ↵Linus Torvalds2011-08-0418-50/+1141
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: cpuidle: stop depending on pm_idle x86 idle: move mwait_idle_with_hints() to where it is used cpuidle: replace xen access to x86 pm_idle and default_idle cpuidle: create bootparam "cpuidle.off=1" mrst_pmu: driver for Intel Moorestown Power Management Unit
| * | cpuidle: stop depending on pm_idleLen Brown2011-08-046-25/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cpuidle users should call cpuidle_call_idle() directly rather than via (pm_idle)() function pointer. Architecture may choose to continue using (pm_idle)(), but cpuidle need not depend on it: my_arch_cpu_idle() ... if(cpuidle_call_idle()) pm_idle(); cc: Kevin Hilman <khilman@deeprootsystems.com> cc: Paul Mundt <lethal@linux-sh.org> cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| * | x86 idle: move mwait_idle_with_hints() to where it is usedLen Brown2011-08-043-25/+23Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ...and make it static no functional change cc: x86@kernel.org Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| * | cpuidle: replace xen access to x86 pm_idle and default_idleLen Brown2011-08-043-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a Xen Dom0 kernel boots on a hypervisor, it gets access to the raw-hardware ACPI tables. While it parses the idle tables for the hypervisor's beneift, it uses HLT for its own idle. Rather than have xen scribble on pm_idle and access default_idle, have it simply disable_cpuidle() so acpi_idle will not load and architecture default HLT will be used. cc: xen-devel@lists.xensource.com Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| * | cpuidle: create bootparam "cpuidle.off=1"Len Brown2011-08-045-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | useful for disabling cpuidle to fall back to architecture-default idle loop cpuidle drivers and governors will fail to register. on x86 they'll say so: intel_idle: intel_idle yielding to (null) ACPI: acpi_idle yielding to (null) Signed-off-by: Len Brown <len.brown@intel.com>
| * | mrst_pmu: driver for Intel Moorestown Power Management UnitLen Brown2011-08-044-0/+1058
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Moorestown (MRST) Power Management Unit (PMU) driver directs the SOC power states in the "Langwell" south complex (SCU). It hooks pci_platform_pm_ops[] and thus observes all PCI ".set_state" requests. For devices in the SC, the pmu driver translates those PCI requests into the appropriate commands for the SCU. The PMU driver helps implement S0i3, a deep system idle power idle state. Entry into S0i3 is via cpuidle, just like regular processor c-states. S0i3 depends on pre-conditions including uni-processor, graphics off, and certain IO devices in the SC must be off. If those pre-conditions are met, then the PMU allows cpuidle to enter S0i3, otherwise such requests are demoted, either to Atom C4 or Atom C6. This driver is based on prototype work by Bruce Flemming, Illyas Mansoor, Rajeev D. Muralidhar, Vishwesh M. Rudramuni, Hari Seshadri and Sujith Thomas. The current driver also includes contributions from H. Peter Anvin, Arjan van de Ven, Kristen Accardi, and Yong Wang. Thanks for additional review feedback from Alan Cox and Randy Dunlap. Acked-by: Alan Cox <alan@linux.intel.com> Acked-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
* | | Merge branch 'apei-release' of ↵Linus Torvalds2011-08-0435-135/+1172
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'apei-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPI, APEI, EINJ Param support is disabled by default APEI GHES: 32-bit buildfix ACPI: APEI build fix ACPI, APEI, GHES: Add hardware memory error recovery support HWPoison: add memory_failure_queue() ACPI, APEI, GHES, Error records content based throttle ACPI, APEI, GHES, printk support for recoverable error via NMI lib, Make gen_pool memory allocator lockless lib, Add lock-less NULL terminated single list Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHG ACPI, APEI, Add WHEA _OSC support ACPI, APEI, Add APEI bit support in generic _OSC call ACPI, APEI, GHES, Support disable GHES at boot time ACPI, APEI, GHES, Prevent GHES to be built as module ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERST ACPI, APEI, Add apei_exec_run_optional ACPI, APEI, GHES, Do not ratelimit fatal error printk before panic ACPI, APEI, ERST, Fix erst-dbg long record reading issue ACPI, APEI, ERST, Prevent erst_dbg from loading if ERST is disabled
| * \ \ Merge branch 'apei' into apei-releaseLen Brown2011-08-0335-135/+1172
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some trivial conflicts due to other various merges adding to the end of common lists sooner than this one. arch/ia64/Kconfig arch/powerpc/Kconfig arch/x86/Kconfig lib/Kconfig lib/Makefile Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, EINJ Param support is disabled by defaultHuang Ying2011-08-032-17/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | EINJ parameter support is only usable for some specific BIOS. Originally, it is expected to have no harm for BIOS does not support it. But now, we found it will cause issue (memory overwriting) for some BIOS. So param support is disabled by default and only enabled when newly added module parameter named "param_extension" is explicitly specified. Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Matthew Garrett <mjg@redhat.com> Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | APEI GHES: 32-bit buildfixLen Brown2011-08-031-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | drivers/acpi/apei/ghes.c:542: warning: integer overflow in expression drivers/acpi/apei/ghes.c:619: warning: integer overflow in expression ghes.c:(.text+0x46289): undefined reference to `__udivdi3'   in function ghes_estatus_cache_add(). Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI: APEI build fixLen Brown2011-08-032-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | as GHES is optional... When # CONFIG_ACPI_APEI_GHES is not set: (.init.text+0x4c22): undefined reference to `ghes_disable' Reported-by: Randy Dunlap <rdunlap@xenotime.net> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, GHES: Add hardware memory error recovery supportHuang Ying2011-08-032-7/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memory_failure_queue() is called when recoverable memory errors are notified by firmware to do the recovery work. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | HWPoison: add memory_failure_queue()Huang Ying2011-08-032-0/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memory_failure() is the entry point for HWPoison memory error recovery. It must be called in process context. But commonly hardware memory errors are notified via MCE or NMI, so some delayed execution mechanism must be used. In MCE handler, a work queue + ring buffer mechanism is used. In addition to MCE, now APEI (ACPI Platform Error Interface) GHES (Generic Hardware Error Source) can be used to report memory errors too. To add support to APEI GHES memory recovery, a mechanism similar to that of MCE is implemented. memory_failure_queue() is the new entry point that can be called in IRQ context. The next step is to make MCE handler uses this interface too. Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, GHES, Error records content based throttleHuang Ying2011-08-031-7/+177
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | printk is used by GHES to report hardware errors. Ratelimit is enforced on the printk to avoid too many hardware error reports in kernel log. Because there may be thousands or even millions of corrected hardware errors during system running. Currently, a simple scheme is used. That is, the total number of hardware error reporting is ratelimited. This may cause some issues in practice. For example, there are two kinds of hardware errors occurred in system. One is corrected memory error, because the fault memory address is accessed frequently, there may be hundreds error report per-second. The other is corrected PCIe AER error, it will be reported once per-second. Because they share one ratelimit control structure, it is highly possible that only memory error is reported. To avoid the above issue, an error record content based throttle algorithm is implemented in the patch. Where after the first successful reporting, all error records that are same are throttled for some time, to let other kinds of error records have the opportunity to be reported. In above example, the memory errors will be throttled for some time, after being printked. Then the PCIe AER error will be printked successfully. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, GHES, printk support for recoverable error via NMIHuang Ying2011-08-032-18/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some APEI GHES recoverable errors are reported via NMI, but printk is not safe in NMI context. To solve the issue, a lock-less memory allocator is used to allocate memory in NMI handler, save the error record into the allocated memory, put the error record into a lock-less list. On the other hand, an irq_work is used to delay the operation from NMI context to IRQ context. The irq_work IRQ handler will remove nodes from lock-less list, printk the error record and do some further processing include recovery operation, then free the memory. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | lib, Make gen_pool memory allocator locklessHuang Ying2011-08-034-65/+272
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This version of the gen_pool memory allocator supports lockless operation. This makes it safe to use in NMI handlers and other special unblockable contexts that could otherwise deadlock on locks. This is implemented by using atomic operations and retries on any conflicts. The disadvantage is that there may be livelocks in extreme cases. For better scalability, one gen_pool allocator can be used for each CPU. The lockless operation only works if there is enough memory available. If new memory is added to the pool a lock has to be still taken. So any user relying on locklessness has to ensure that sufficient memory is preallocated. The basic atomic operation of this allocator is cmpxchg on long. On architectures that don't have NMI-safe cmpxchg implementation, the allocator can NOT be used in NMI handler. So code uses the allocator in NMI handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | lib, Add lock-less NULL terminated single listHuang Ying2011-08-034-0/+260
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cmpxchg is used to implement adding new entry to the list, deleting all entries from the list, deleting first entry of the list and some other operations. Because this is a single list, so the tail can not be accessed in O(1). If there are multiple producers and multiple consumers, llist_add can be used in producers and llist_del_all can be used in consumers. They can work simultaneously without lock. But llist_del_first can not be used here. Because llist_del_first depends on list->first->next does not changed if list->first is not changed during its operation, but llist_del_first, llist_add, llist_add (or llist_del_all, llist_add, llist_add) sequence in another consumer may violate that. If there are multiple producers and one consumer, llist_add can be used in producers and llist_del_all or llist_del_first can be used in the consumer. This can be summarized as follow: | add | del_first | del_all add | - | - | - del_first | | L | L del_all | | | - Where "-" stands for no lock is needed, while "L" stands for lock is needed. The list entries deleted via llist_del_all can be traversed with traversing function such as llist_for_each etc. But the list entries can not be traversed safely before deleted from the list. The order of deleted entries is from the newest to the oldest added one. If you want to traverse from the oldest to the newest, you must reverse the order by yourself before traversing. The basic atomic operation of this list is cmpxchg on long. On architectures that don't have NMI-safe cmpxchg implementation, the list can NOT be used in NMI handler. So code uses the list in NMI handler should depend on CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHGHuang Ying2011-08-0313-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cmpxchg() is widely used by lockless code, including NMI-safe lockless code. But on some architectures, the cmpxchg() implementation is not NMI-safe, on these architectures the lockless code may need a spin_trylock_irqsave() based implementation. This patch adds a Kconfig option: ARCH_HAVE_NMI_SAFE_CMPXCHG, so that NMI-safe lockless code can depend on it or provide different implementation according to it. On many architectures, cmpxchg is only NMI-safe for several specific operand sizes. So, ARCH_HAVE_NMI_SAFE_CMPXCHG define in this patch only guarantees cmpxchg is NMI-safe for sizeof(unsigned long). Signed-off-by: Huang Ying <ying.huang@intel.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Richard Henderson <rth@twiddle.net> CC: Mikael Starvik <starvik@axis.com> Acked-by: David Howells <dhowells@redhat.com> CC: Yoshinori Sato <ysato@users.sourceforge.jp> CC: Tony Luck <tony.luck@intel.com> CC: Hirokazu Takata <takata@linux-m32r.org> CC: Geert Uytterhoeven <geert@linux-m68k.org> CC: Michal Simek <monstr@monstr.eu> Acked-by: Ralf Baechle <ralf@linux-mips.org> CC: Kyle McMartin <kyle@mcmartin.ca> CC: Martin Schwidefsky <schwidefsky@de.ibm.com> CC: Chen Liqin <liqin.chen@sunplusct.com> CC: "David S. Miller" <davem@davemloft.net> CC: Ingo Molnar <mingo@redhat.com> CC: Chris Zankel <chris@zankel.net> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, Add WHEA _OSC supportHuang Ying2011-07-143-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | APEI firmware first mode must be turned on explicitly on some machines, otherwise there may be no GHES hardware error record for hardware error notification. APEI bit in generic _OSC call can be used to do that, but on some machine, a special WHEA _OSC call must be used. This patch adds the support to that WHEA _OSC call. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, Add APEI bit support in generic _OSC callHuang Ying2011-07-142-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In APEI firmware first mode, hardware error is reported by hardware to firmware firstly, then firmware reports the error to Linux in a GHES error record via POLL/SCI/IRQ/NMI etc. This may result in some issues if OS has no full APEI support. So some firmware implementation will work in a back-compatible mode by default. Where firmware will only notify OS in old-fashion, without GHES record. For example, for a fatal hardware error, only NMI is signaled, no GHES record. To gain full APEI power on these machines, APEI bit in generic _OSC call can be specified to tell firmware that Linux has full APEI support. This patch adds the APEI bit support in generic _OSC call. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, GHES, Support disable GHES at boot timeHuang Ying2011-07-143-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some machine may have broken firmware so that GHES and firmware first mode should be disabled. This patch adds support to that. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, GHES, Prevent GHES to be built as moduleHuang Ying2011-07-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GHES (Generic Hardware Error Source) is used to process hardware error notification in firmware first mode. But because firmware first mode can be turned on but can not be turned off, it is unreasonable to unload the GHES module with firmware first mode turned on. To avoid confusion, this patch makes GHES can be enabled/disabled in configuration time, but not built as module and unloaded at run time. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERSTHuang Ying2011-07-142-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes APEI EINJ and ERST to use apei_exec_run for mandatory actions, and apei_exec_run_optional for optional actions. Cc: Thomas Renninger <trenn@novell.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, Add apei_exec_run_optionalHuang Ying2011-07-142-5/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some actions in APEI ERST and EINJ tables are optional, for example, ACPI_EINJ_BEGIN_OPERATION action is used to do some preparation for error injection, and firmware may choose to do nothing here. While some other actions are mandatory, for example, firmware must provide ACPI_EINJ_GET_ERROR_TYPE implementation. Original implementation treats all actions as optional (that is, can have no instructions), that may cause issue if firmware does not provide some mandatory actions. To fix this, this patch adds apei_exec_run_optional, which should be used for optional actions. The original apei_exec_run should be used for mandatory actions. Cc: Thomas Renninger <trenn@novell.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, GHES, Do not ratelimit fatal error printk before panicHuang Ying2011-07-141-11/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | printk is used by GHES to report hardware errors. Normally, the printk will be ratelimited to avoid too many hardware error reports in kernel log. Because there may be thousands or even millions of corrected hardware errors during system running. That is different for fatal hardware error, because system will go panic as soon as possible, there will be no more than several error records. And these error records are valuable for system fault diagnosis, so they should not be ratelimited. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, ERST, Fix erst-dbg long record reading issueChen Gong2011-07-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we debug ERST table with erst-dbg, if the error record in ERST table is too long(>4K), it can't be read out. So this patch increases the buffer size to 16K to ensure such error records can be read from ERST table. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
| | * | | ACPI, APEI, ERST, Prevent erst_dbg from loading if ERST is disabledHuang Ying2011-07-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | erst_dbg module can not work when ERST is disabled. So disable module loading to provide clearer information to user. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
* | | | | Merge branch 'for-next' of ↵Linus Torvalds2011-08-049-59/+122
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: tcm_fc: Handle DDP/SW fc_frame_payload_get failures in ft_recv_write_data target: Fix bug for transport_generic_wait_for_tasks with direct operation target: iscsi_target depends on NET target: Fix WRITE_SAME_16 lba assignment breakage MAINTAINERS: Add target-devel list for drivers/target/ iscsi-target: Fix CONFIG_SMP=n and CONFIG_MODULES=n build failure iscsi-target: Fix snprintf usage with MAX_PORTAL_LEN iscsi-target: Fix uninitialized usage of cmd->pad_bytes iscsi-target: strlen() doesn't count the terminator iscsi-target: Fix NULL dereference on allocation failure
| * | | | | tcm_fc: Handle DDP/SW fc_frame_payload_get failures in ft_recv_write_dataKiran Patil2011-08-033-49/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: HW DDP context was not invalidated in case of ABORTS, etc... This leads to the problem where memory pages which are used for DDP as user descriptor could get reused for some other purpose (such as to satisfy new memory allocation request either by kernel or user mode threads) and since HW DDP context was not invalidated, HW continue to write to those pages, hence causing memory corruption. Fix: Either on incoming ABORTS or due to exchange time out, allowed the target to cleanup HW DDP context if it was setup for respective ft_cmd. Added new function to perform this cleanup, furthur it can be enhanced for other cleanup activity. Fix ft_recv_write_data() to properly handle fc_frame_payload_get to return pointer to payload if it exist. If there is no payload which is most common case (+ve case in case if DDP is working as expected, it will return NULL. Yes, scope of buf is limited to printk. Invalidation of HW context (which is done inside ft_invl_hw_context() is necessary in SUCCESS and FAILURE case of DDP. Hence invalidation is DONE as long as there was DDP setup (whether it worked correctly or not, NOTE: For some reason, if there is any error w.r.t DDP such as out of order packet reception, HW simply post the full packet in rx queue. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Cc: Robert W Love <robert.w.love@intel.com> Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
| * | | | | target: Fix bug for transport_generic_wait_for_tasks with direct operationNicholas Bellinger2011-07-301-2/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug in transport_handle_cdb_direct() usage with target_core where transport_generic_wait_for_tasks() was bypassing active I/O + usage of cmd->t_transport_stop_comp because cmd->t_transport_active=1 was not being set before dispatching with transport_generic_new_cmd(). The fix follows existing usage in transport_generic_handle_cdb*() -> transport_add_cmd_to_queue() and set these directly, as well as handle transport_generic_new_cmd() exceptions for QUEUE_FULL and CHECK_CONDITION instead of propigating up to RX context fabric code. The bug was manifesting itself with the following SLUB poison overwritten warnings with iscsi-target v4.1 LUNs using the new process context direct operation during session reinstatement with active I/O exception handling: [885410.498267] ============================================================================= [885410.621622] BUG lio_cmd_cache: Poison overwritten [885410.621791] ----------------------------------------------------------------------------- [885410.621792] [885410.623420] INFO: 0xffff880000cf3750-0xffff880000cf378d. First byte 0x6a instead of 0x6b [885410.626332] INFO: Allocated in iscsit_allocate_cmd+0x1c/0xd4 [iscsi_target_mod] age=345 cpu=1 pid=22554 [885411.855189] INFO: Freed in iscsit_release_cmd+0x208/0x217 [iscsi_target_mod] age=1410 cpu=1 pid=22554 [885411.856048] INFO: Slab 0xffffea000002d480 objects=22 used=0 fp=0xffff880000cf7300 flags=0x4080 [885411.856368] INFO: Object 0xffff880000cf33c0 @offset=13248 fp=0xffff880000cf6780 <SNIP> [885411.955678] Pid: 22554, comm: iscsi_trx Not tainted 3.0.0-rc7+ #30 [885411.956040] Call Trace: [885411.957029] [<ffffffff810e5cf9>] print_trailer+0x12e/0x137 [885412.752879] [<ffffffff810e61d9>] check_bytes_and_report+0xb9/0xfd [885412.754933] [<ffffffff810e62d2>] check_object+0xb5/0x192 [885412.755099] [<ffffffff810e6445>] __free_slab+0x96/0x13a [885412.757008] [<ffffffff810e652a>] discard_slab+0x41/0x43 [885412.758171] [<ffffffff810e7a4c>] __slab_free+0xf3/0xfe [885412.761027] [<ffffffffa030a536>] ? iscsit_release_cmd+0x208/0x217 [iscsi_target_mod] [885412.761354] [<ffffffff810e7e95>] kmem_cache_free+0x6f/0xac [885412.761536] [<ffffffffa030a536>] iscsit_release_cmd+0x208/0x217 [iscsi_target_mod] [885412.762056] [<ffffffffa020e467>] ? iblock_free_task+0x34/0x39 [target_core_iblock] [885412.762368] [<ffffffffa0314131>] lio_release_cmd+0x10/0x12 [iscsi_target_mod] [885412.764129] [<ffffffffa02c2254>] transport_release_cmd+0x2f/0x33 [target_core_mod] [885412.805024] [<ffffffffa02c230e>] transport_generic_remove+0xb6/0xc3 [target_core_mod] [885412.806424] [<ffffffff81035b5f>] ? try_to_wake_up+0x1bd/0x1bd [885412.809033] [<ffffffffa02c241f>] transport_generic_free_cmd+0x75/0x7d [target_core_mod] [885412.810066] [<ffffffffa02c2643>] transport_generic_wait_for_tasks+0x21c/0x22b [target_core_mod] [885412.811056] [<ffffffff8139f0b1>] ? mutex_lock+0x11/0x32 [885412.813059] [<ffffffff8139f0b1>] ? mutex_lock+0x11/0x32 [885412.813200] [<ffffffffa030b81d>] iscsit_close_connection+0x1d5/0x63a [iscsi_target_mod] [885412.813517] [<ffffffffa0300a82>] iscsit_take_action_for_connection_exit+0xdb/0xe0 [iscsi_target_mod] [885412.813851] [<ffffffffa03111e9>] iscsi_target_rx_thread+0x11f6/0x1221 [iscsi_target_mod] [885412.829024] [<ffffffff81033e8d>] ? pick_next_task_fair+0xbe/0x10e [885412.831010] [<ffffffffa030fff3>] ? iscsit_handle_scsi_cmd+0x91d/0x91d [iscsi_target_mod] [885412.833011] [<ffffffffa030fff3>] ? iscsit_handle_scsi_cmd+0x91d/0x91d [iscsi_target_mod] [885412.835010] [<ffffffff8105388a>] kthread+0x7d/0x85 [885412.837022] [<ffffffff813a7124>] kernel_thread_helper+0x4/0x10 [885412.838008] [<ffffffff8105380d>] ? kthread_worker_fn+0x145/0x145 [885412.840047] [<ffffffff813a7120>] ? gs_change+0x13/0x13 [885412.842007] FIX lio_cmd_cache: Restoring 0xffff880000cf3750-0xffff880000cf378d=0x6 Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | | | target: iscsi_target depends on NETRandy Dunlap2011-07-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iscsi target code uses lots on network interface functions, so it should depend on NET. Fixes many build errors when NET is not enabled: ERROR: "kernel_sendmsg" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "in_aton" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "sock_release" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "kernel_listen" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "kernel_setsockopt" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "kernel_recvmsg" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "kernel_accept" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "sock_create" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "kernel_bind" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! ERROR: "in6_pton" [drivers/target/iscsi/iscsi_target_mod.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | | | target: Fix WRITE_SAME_16 lba assignment breakageNicholas Bellinger2011-07-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug in WRITE_SAME_16 LBA assignment where get_unaligned_be16() is incorrectly being used instead of get_unaligned_be64() for a 64-bit LBA. This was introduced with: commit a1d8b49abd60ba5d09e7c968731abcb0f8f1cbf6 Author: Andy Grover <agrover@redhat.com> Date: Mon May 2 17:12:10 2011 -0700 target: Updates from AGrover and HCH (round 3) (target: inline struct se_transport_task into struct se_cmd) Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | | | MAINTAINERS: Add target-devel list for drivers/target/Nicholas Bellinger2011-07-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | | | iscsi-target: Fix CONFIG_SMP=n and CONFIG_MODULES=n build failureNicholas Bellinger2011-07-271-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following CONFIG_SMP=n and CONFIG_MODULES=n build failure, because iscsit_thread_get_cpumask() is defined as a macro in iscsi_target.c, but needed by iscsi_target_login.c drivers/built-in.o: In function `iscsi_post_login_handler': iscsi_target_login.c:(.text+0x13a315): undefined reference to `iscsit_thread_get_cpumask' iscsi_target_login.c:(.text+0x13a4b4): undefined reference to `iscsit_thread_get_cpumask' make: *** [.tmp_vmlinux1] Error 1 Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | | | iscsi-target: Fix snprintf usage with MAX_PORTAL_LENNicholas Bellinger2011-07-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes lio_target_call_addnptotpg() use sprintf() with MAX_PORTAL_LEN + 1 to address the following smatch warning: drivers/target/iscsi/iscsi_target_configfs.c +184 lio_target_call_addnptotpg(21) error: snprintf() chops off the last chars of 'name': 257 vs 256 Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | | | iscsi-target: Fix uninitialized usage of cmd->pad_bytesNicholas Bellinger2011-07-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes an uninitialized usage of cmd->pad_bytes inside of iscsit_handle_text_cmd() introduced during a v4.1 change to use cmd members instead of local pad_bytes variables. Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>