summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* workqueue: drop @bind from create_worker()Tejun Heo2012-07-171-19/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, create_worker()'s callers are responsible for deciding whether the newly created worker should be bound to the associated CPU and create_worker() sets WORKER_UNBOUND only for the workers for the unbound global_cwq. Creation during normal operation is always via maybe_create_worker() and @bind is true. For workers created during hotplug, @bind is false. Normal operation path is planned to be used even while the CPU is going through hotplug operations or offline and this static decision won't work. Drop @bind from create_worker() and decide whether to bind by looking at GCWQ_DISASSOCIATED. create_worker() will also set WORKER_UNBOUND autmatically if disassociated. To avoid flipping GCWQ_DISASSOCIATED while create_worker() is in progress, the flag is now allowed to be changed only while holding all manager_mutexes on the global_cwq. This requires that GCWQ_DISASSOCIATED is not cleared behind trustee's back. CPU_ONLINE no longer clears DISASSOCIATED before flushing trustee, which clears DISASSOCIATED before rebinding remaining workers if asked to release. For cases where trustee isn't around, CPU_ONLINE clears DISASSOCIATED after flushing trustee. Also, now, first_idle has UNBOUND set on creation which is explicitly cleared by CPU_ONLINE while binding it. These convolutions will soon be removed by further simplification of CPU hotplug path. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
* workqueue: use mutex for global_cwq manager exclusionTejun Heo2012-07-171-39/+26Star
| | | | | | | | | | | | | | | | | | | | | POOL_MANAGING_WORKERS is used to ensure that at most one worker takes the manager role at any given time on a given global_cwq. Trustee later hitched on it to assume manager adding blocking wait for the bit. As trustee already needed a custom wait mechanism, waiting for MANAGING_WORKERS was rolled into the same mechanism. Trustee is scheduled to be removed. This patch separates out MANAGING_WORKERS wait into per-pool mutex. Workers use mutex_trylock() to test for manager role and trustee uses mutex_lock() to claim manager roles. gcwq_claim/release_management() helpers are added to grab and release manager roles of all pools on a global_cwq. gcwq_claim_management() always grabs pool manager mutexes in ascending pool index order and uses pool index as lockdep subclass. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
* workqueue: ROGUE workers are UNBOUND workersTejun Heo2012-07-171-25/+21Star
| | | | | | | | | | | | | | | | Currently, WORKER_UNBOUND is used to mark workers for the unbound global_cwq and WORKER_ROGUE is used to mark workers for disassociated per-cpu global_cwqs. Both are used to make the marked worker skip concurrency management and the only place they make any difference is in worker_enter_idle() where WORKER_ROGUE is used to skip scheduling idle timer, which can easily be replaced with trustee state testing. This patch replaces WORKER_ROGUE with WORKER_UNBOUND and drops WORKER_ROGUE. This is to prepare for removing trustee and handling disassociated global_cwqs as unbound. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
* workqueue: drop CPU_DYING notifier operationTejun Heo2012-07-171-16/+13Star
| | | | | | | | | | | | | | | Workqueue used CPU_DYING notification to mark GCWQ_DISASSOCIATED. This was necessary because workqueue's CPU_DOWN_PREPARE happened before other DOWN_PREPARE notifiers and workqueue needed to stay associated across the rest of DOWN_PREPARE. After the previous patch, workqueue's DOWN_PREPARE happens after others and can set GCWQ_DISASSOCIATED directly. Drop CPU_DYING and let the trustee set GCWQ_DISASSOCIATED after disabling concurrency management. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
* workqueue: perform cpu down operations from low priority cpu_notifier()Tejun Heo2012-07-172-3/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, all workqueue cpu hotplug operations run off CPU_PRI_WORKQUEUE which is higher than normal notifiers. This is to ensure that workqueue is up and running while bringing up a CPU before other notifiers try to use workqueue on the CPU. Per-cpu workqueues are supposed to remain working and bound to the CPU for normal CPU_DOWN_PREPARE notifiers. This holds mostly true even with workqueue offlining running with higher priority because workqueue CPU_DOWN_PREPARE only creates a bound trustee thread which runs the per-cpu workqueue without concurrency management without explicitly detaching the existing workers. However, if the trustee needs to create new workers, it creates unbound workers which may wander off to other CPUs while CPU_DOWN_PREPARE notifiers are in progress. Furthermore, if the CPU down is cancelled, the per-CPU workqueue may end up with workers which aren't bound to the CPU. While reliably reproducible with a convoluted artificial test-case involving scheduling and flushing CPU burning work items from CPU down notifiers, this isn't very likely to happen in the wild, and, even when it happens, the effects are likely to be hidden by the following successful CPU down. Fix it by using different priorities for up and down notifiers - high priority for up operations and low priority for down operations. Workqueue cpu hotplug operations will soon go through further cleanup. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
* workqueue: reimplement WQ_HIGHPRI using a separate worker_poolTejun Heo2012-07-142-138/+65Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WQ_HIGHPRI was implemented by queueing highpri work items at the head of the global worklist. Other than queueing at the head, they weren't handled differently; unfortunately, this could lead to execution latency of a few seconds on heavily loaded systems. Now that workqueue code has been updated to deal with multiple worker_pools per global_cwq, this patch reimplements WQ_HIGHPRI using a separate worker_pool. NR_WORKER_POOLS is bumped to two and gcwq->pools[0] is used for normal pri work items and ->pools[1] for highpri. Highpri workers get -20 nice level and has 'H' suffix in their names. Note that this change increases the number of kworkers per cpu. POOL_HIGHPRI_PENDING, pool_determine_ins_pos() and highpri chain wakeup code in process_one_work() are no longer used and removed. This allows proper prioritization of highpri work items and removes high execution latency of highpri work items. v2: nr_running indexing bug in get_pool_nr_running() fixed. v3: Refreshed for the get_pool_nr_running() update in the previous patch. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Josh Hunt <joshhunt00@gmail.com> LKML-Reference: <CAKA=qzaHqwZ8eqpLNFjxnO2fX-tgAOjmpvxgBFjv6dJeQaOW1w@mail.gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fengguang Wu <fengguang.wu@intel.com>
* workqueue: introduce NR_WORKER_POOLS and for_each_worker_pool()Tejun Heo2012-07-141-70/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | Introduce NR_WORKER_POOLS and for_each_worker_pool() and convert code paths which need to manipulate all pools in a gcwq to use them. NR_WORKER_POOLS is currently one and for_each_worker_pool() iterates over only @gcwq->pool. Note that nr_running is per-pool property and converted to an array with NR_WORKER_POOLS elements and renamed to pool_nr_running. Note that get_pool_nr_running() currently assumes 0 index. The next patch will make use of non-zero index. The changes in this patch are mechanical and don't caues any functional difference. This is to prepare for multiple pools per gcwq. v2: nr_running indexing bug in get_pool_nr_running() fixed. v3: Pointer to array is stupid. Don't use it in get_pool_nr_running() as suggested by Linus. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
* workqueue: separate out worker_pool flagsTejun Heo2012-07-121-22/+25
| | | | | | | | | | | | GCWQ_MANAGE_WORKERS, GCWQ_MANAGING_WORKERS and GCWQ_HIGHPRI_PENDING are per-pool properties. Add worker_pool->flags and make the above three flags per-pool flags. The changes in this patch are mechanical and don't caues any functional difference. This is to prepare for multiple pools per gcwq. Signed-off-by: Tejun Heo <tj@kernel.org>
* workqueue: use @pool instead of @gcwq or @cpu where applicableTejun Heo2012-07-121-107/+111
| | | | | | | | | | | Modify all functions which deal with per-pool properties to pass around @pool instead of @gcwq or @cpu. The changes in this patch are mechanical and don't caues any functional difference. This is to prepare for multiple pools per gcwq. Signed-off-by: Tejun Heo <tj@kernel.org>
* workqueue: factor out worker_pool from global_cwqTejun Heo2012-07-122-100/+118
| | | | | | | | | | | | | | | | | Move worklist and all worker management fields from global_cwq into the new struct worker_pool. worker_pool points back to the containing gcwq. worker and cpu_workqueue_struct are updated to point to worker_pool instead of gcwq too. This change is mechanical and doesn't introduce any functional difference other than rearranging of fields and an added level of indirection in some places. This is to prepare for multiple pools per gcwq. v2: Comment typo fixes as suggested by Namhyung. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org>
* workqueue: don't use WQ_HIGHPRI for unbound workqueuesTejun Heo2012-07-121-7/+11
| | | | | | | | | | | | | Unbound wqs aren't concurrency-managed and try to execute work items as soon as possible. This is currently achieved by implicitly setting %WQ_HIGHPRI on all unbound workqueues; however, WQ_HIGHPRI implementation is about to be restructured and this usage won't be valid anymore. Add an explicit chain-wakeup path for unbound workqueues in process_one_work() instead of piggy backing on %WQ_HIGHPRI. Signed-off-by: Tejun Heo <tj@kernel.org>
* Merge tag 'fbdev-fixes-for-3.5-2' of git://github.com/schandinat/linux-2.6Linus Torvalds2012-07-127-22/+33
|\ | | | | | | | | | | | | | | | | | | | | Pull fbdev fixes from Florian Tobias Schandinat: "Two fixes for OMAPDSS by Tomi Valkeinen: - one to avoid warnings when runtime PM is not enabled - one workaround to dependancy issues during suspend/resume" * tag 'fbdev-fixes-for-3.5-2' of git://github.com/schandinat/linux-2.6: OMAPDSS: fix warnings if CONFIG_PM_RUNTIME=n OMAPDSS: Use PM notifiers for system suspend
| * OMAPDSS: fix warnings if CONFIG_PM_RUNTIME=nTomi Valkeinen2012-07-086-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If runtime PM is not enabled in the kernel config, pm_runtime_get_sync() will always return 1 and pm_runtime_put_sync() will always return -ENOSYS. pm_runtime_get_sync() returning 1 presents no problem to the driver, but -ENOSYS from pm_runtime_put_sync() causes the driver to print a warning. One option would be to ignore errors returned by pm_runtime_put_sync() totally, as they only say that the call was unable to put the hardware into suspend mode. However, I chose to ignore the returned -ENOSYS explicitly, and print a warning for other errors, as I think we should get notified if the HW failed to go to suspend properly. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Jassi Brar <jaswinder.singh@linaro.org> Cc: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Archit Taneja <archit@ti.com> Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
| * OMAPDSS: Use PM notifiers for system suspendTomi Valkeinen2012-07-081-16/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way how omapdss handles system suspend and resume is that omapdss device (a platform device, which is not part of the device hierarchy of the DSS HW devices, like DISPC and DSI, or panels.) uses the suspend and resume callbacks from platform_driver to handle system suspend. It does this by disabling all enabled panels on suspend, and resuming the previously disabled panels on resume. This presents a few problems. One is that as omapdss device is not related to the panel devices or the DSS HW devices, there's no ordering in the suspend process. This means that suspend could be first ran for DSS HW devices and panels, and only then for omapdss device. Currently this is not a problem, as DSS HW devices and panels do not handle suspend. Another, more pressing problem, is that when suspending or resuming, the runtime PM functions return -EACCES as runtime PM is disabled during system suspend. This causes the driver to print warnings, and operations to fail as they think that they failed to bring up the HW. This patch changes the omapdss suspend handling to use PM notifiers, which are called before suspend and after resume. This way we have a normally functioning system when we are suspending and resuming the panels. This patch, I believe, creates a problem that somebody could enable or disable a panel between PM_SUSPEND_PREPARE and the system suspend, and similarly the other way around in resume. I choose to ignore the problem for now, as it sounds rather unlikely, and if it happens, it's not fatal. In the long run the system suspend handling of omapdss and panels should be thought out properly. The current approach feels rather hacky. Perhaps the panel drivers should handle system suspend, or the users of omapdss (omapfb, omapdrm) should handle system suspend. Note that after this patch we could probably revert 0eaf9f52e94f756147dbfe1faf1f77a02378dbf9 (OMAPDSS: use sync versions of pm_runtime_put). But as I said, this patch may be temporary, so let's leave the sync version still in place. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Reported-by: Jassi Brar <jaswinder.singh@linaro.org> Tested-by: Jassi Brar <jaswinder.singh@linaro.org> Tested-by: Joe Woodward <jw@terrafix.co.uk> Signed-off-by: Archit Taneja <archit@ti.com> [fts: fixed 2 brace coding style issues] Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
* | Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds2012-07-1238-247/+197Star
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge random patches from Andrew Morton. * Merge emailed patches from Andrew Morton <akpm@linux-foundation.org>: (32 commits) memblock: free allocated memblock_reserved_regions later mm: sparse: fix usemap allocation above node descriptor section mm: sparse: fix section usemap placement calculation xtensa: fix incorrect memset shmem: cleanup shmem_add_to_page_cache shmem: fix negative rss in memcg memory.stat tmpfs: revert SEEK_DATA and SEEK_HOLE drivers/rtc/rtc-twl.c: fix threaded IRQ to use IRQF_ONESHOT fat: fix non-atomic NFS i_pos read MAINTAINERS: add OMAP CPUfreq driver to OMAP Power Management section sgi-xp: nested calls to spin_lock_irqsave() fs: ramfs: file-nommu: add SetPageUptodate() drivers/rtc/rtc-mxc.c: fix irq enabled interrupts warning mm/memory_hotplug.c: release memory resources if hotadd_new_pgdat() fails h8300/uaccess: add mising __clear_user() h8300/uaccess: remove assignment to __gu_val in unhandled case of get_user() h8300/time: add missing #include <asm/irq_regs.h> h8300/signal: fix typo "statis" h8300/pgtable: add missing #include <asm-generic/pgtable.h> drivers/rtc/rtc-ab8500.c: ensure correct probing of the AB8500 RTC when Device Tree is enabled ...
| * | memblock: free allocated memblock_reserved_regions laterYinghai Lu2012-07-123-46/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memblock_free_reserved_regions() calls memblock_free(), but memblock_free() would double reserved.regions too, so we could free the old range for reserved.regions. Also tj said there is another bug which could be related to this. | I don't think we're saving any noticeable | amount by doing this "free - give it to page allocator - reserve | again" dancing. We should just allocate regions aligned to page | boundaries and free them later when memblock is no longer in use. in that case, when DEBUG_PAGEALLOC, will get panic: memblock_free: [0x0000102febc080-0x0000102febf080] memblock_free_reserved_regions+0x37/0x39 BUG: unable to handle kernel paging request at ffff88102febd948 IP: [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155 PGD 4826063 PUD cf67a067 PMD cf7fa067 PTE 800000102febd160 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU 0 Pid: 0, comm: swapper Not tainted 3.5.0-rc2-next-20120614-sasha #447 RIP: 0010:[<ffffffff836a5774>] [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155 See the discussion at https://lkml.org/lkml/2012/6/13/469 So try to allocate with PAGE_SIZE alignment and free it later. Reported-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mm: sparse: fix usemap allocation above node descriptor sectionYinghai Lu2012-07-124-7/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit f5bf18fa22f8 ("bootmem/sparsemem: remove limit constraint in alloc_bootmem_section"), usemap allocations may easily be placed outside the optimal section that holds the node descriptor, even if there is space available in that section. This results in unnecessary hotplug dependencies that need to have the node unplugged before the section holding the usemap. The reason is that the bootmem allocator doesn't guarantee a linear search starting from the passed allocation goal but may start out at a much higher address absent an upper limit. Fix this by trying the allocation with the limit at the section end, then retry without if that fails. This keeps the fix from f5bf18fa22f8 of not panicking if the allocation does not fit in the section, but still makes sure to try to stay within the section at first. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: <stable@vger.kernel.org> [3.3.x, 3.4.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mm: sparse: fix section usemap placement calculationYinghai Lu2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 238305bb4d41 ("mm: remove sparsemem allocation details from the bootmem allocator") introduced a bug in the allocation goal calculation that put section usemaps not in the same section as the node descriptors, creating unnecessary hotplug dependencies between them: node 0 must be removed before remove section 16399 node 1 must be removed before remove section 16399 node 2 must be removed before remove section 16399 node 3 must be removed before remove section 16399 node 4 must be removed before remove section 16399 node 5 must be removed before remove section 16399 node 6 must be removed before remove section 16399 The reason is that it applies PAGE_SECTION_MASK to the physical address of the node descriptor when finding a suitable place to put the usemap, when this mask is actually intended to be used with PFNs. Because the PFN mask is wider, the target address will point beyond the wanted section holding the node descriptor and the node must be offlined before the section holding the usemap can go. Fix this by extending the mask to address width before use. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | xtensa: fix incorrect memsetAlan Cox2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Addresses: https://bugzilla.kernel.org/show_bug.cgi?id=43871 Reported-by: <dcb314@hotmail.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | shmem: cleanup shmem_add_to_page_cacheHugh Dickins2012-07-121-30/+28Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | shmem_add_to_page_cache() has three callsites, but only one of them wants the radix_tree_preload() (an exceptional entry guarantees that the radix tree node is present in the other cases), and only that site can achieve mem_cgroup_uncharge_cache_page() (PageSwapCache makes it a no-op in the other cases). We did it this way originally to reflect add_to_page_cache_locked(); but it's confusing now, so move the radix_tree preloading and mem_cgroup uncharging to that one caller. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | shmem: fix negative rss in memcg memory.statHugh Dickins2012-07-121-12/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When adding the page_private checks before calling shmem_replace_page(), I did realize that there is a further race, but thought it too unlikely to need a hurried fix. But independently I've been chasing why a mem cgroup's memory.stat sometimes shows negative rss after all tasks have gone: I expected it to be a stats gathering bug, but actually it's shmem swapping's fault. It's an old surprise, that when you lock_page(lookup_swap_cache(swap)), the page may have been removed from swapcache before getting the lock; or it may have been freed and reused and be back in swapcache; and it can even be using the same swap location as before (page_private same). The swapoff case is already secure against this (swap cannot be reused until the whole area has been swapped off, and a new swapped on); and shmem_getpage_gfp() is protected by shmem_add_to_page_cache()'s check for the expected radix_tree entry - but a little too late. By that time, we might have already decided to shmem_replace_page(): I don't know of a problem from that, but I'd feel more at ease not to do so spuriously. And we have already done mem_cgroup_cache_charge(), on perhaps the wrong mem cgroup: and this charge is not then undone on the error path, because PageSwapCache ends up preventing that. It's this last case which causes the occasional negative rss in memory.stat: the page is charged here as cache, but (sometimes) found to be anon when eventually it's uncharged - and in between, it's an undeserved charge on the wrong memcg. Fix this by adding an earlier check on the radix_tree entry: it's inelegant to descend the tree twice, but swapping is not the fast path, and a better solution would need a pair (try+commit) of memcg calls, and a rework of shmem_replace_page() to keep out of the swapcache. We can use the added shmem_confirm_swap() function to replace the find_get_page+page_cache_release we were already doing on the error path. And add a comment on that -EEXIST: it seems a peculiar errno to be using, but originates from its use in radix_tree_insert(). [It can be surprising to see positive rss left in a memcg's memory.stat after all tasks have gone, since it is supposed to count anonymous but not shmem. Aside from sharing anon pages via fork with a task in some other memcg, it often happens after swapping: because a swap page can't be freed while under writeback, nor while locked. So it's not an error, and these residual pages are easily freed once pressure demands.] Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | tmpfs: revert SEEK_DATA and SEEK_HOLEHugh Dickins2012-07-121-93/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert 4fb5ef089b28 ("tmpfs: support SEEK_DATA and SEEK_HOLE"). I believe it's correct, and it's been nice to have from rc1 to rc6; but as the original commit said: I don't know who actually uses SEEK_DATA or SEEK_HOLE, and whether it would be of any use to them on tmpfs. This code adds 92 lines and 752 bytes on x86_64 - is that bloat or worthwhile? Nobody asked for it, so I conclude that it's bloat: let's revert tmpfs to the dumb generic support for v3.5. We can always reinstate it later if useful, and anyone needing it in a hurry can just get it out of git. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Josef Bacik <josef@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Dave Chinner <david@fromorbit.com> Cc: Marco Stornelli <marco.stornelli@gmail.com> Cc: Jeff liu <jeff.liu@oracle.com> Cc: Chris Mason <chris.mason@fusionio.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | drivers/rtc/rtc-twl.c: fix threaded IRQ to use IRQF_ONESHOTKevin Hilman2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Requesting a threaded interrupt without a primary handler and without IRQF_ONESHOT is dangerous, and after commit 1c6c6952 ("genirq: Reject bogus threaded irq requests"), these requests are rejected. This causes ->probe() to fail, and the RTC driver not to be availble. To fix, add IRQF_ONESHOT to the IRQ flags. Tested on OMAP3730/OveroSTORM and OMAP4430/Panda board using rtcwake to wake from system suspend multiple times. Signed-off-by: Kevin Hilman <khilman@ti.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | fat: fix non-atomic NFS i_pos readSteven J. Magnani2012-07-121-7/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fat_encode_fh() can fetch an invalid i_pos value on systems where 64-bit accesses are not atomic. Make it use the same accessor as the rest of the FAT code. Signed-off-by: Steven J. Magnani <steve@digidescorp.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | MAINTAINERS: add OMAP CPUfreq driver to OMAP Power Management sectionKevin Hilman2012-07-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the OMAP CPUFreq driver to the list of files in the OMAP Power Management section. I've already been maintaining this driver, this just makes it official. Signed-off-by: Kevin Hilman <khilman@ti.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | sgi-xp: nested calls to spin_lock_irqsave()Dan Carpenter2012-07-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code here has a nested spin_lock_irqsave(). It's not needed since IRQs are already disabled and it causes a problem because it means that IRQs won't be enabled again at the end. The second call to spin_lock_irqsave() will overwrite the value of irq_flags and we can't restore the proper settings. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Robin Holt <holt@sgi.com> Cc: Jack Steiner <steiner@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | fs: ramfs: file-nommu: add SetPageUptodate()Bob Liu2012-07-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a bug in the below scenario for !CONFIG_MMU: 1. create a new file 2. mmap the file and write to it 3. read the file can't get the correct value Because sys_read() -> generic_file_aio_read() -> simple_readpage() -> clear_page() which causes the page to be zeroed. Add SetPageUptodate() to ramfs_nommu_expand_for_mapping() so that generic_file_aio_read() do not call simple_readpage(). Signed-off-by: Bob Liu <lliubbo@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | drivers/rtc/rtc-mxc.c: fix irq enabled interrupts warningBenoît Thébaudeau2012-07-121-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes WARNING: at irq/handle.c:146 handle_irq_event_percpu+0x19c/0x1b8() irq 25 handler mxc_rtc_interrupt+0x0/0xac enabled interrupts Modules linked in: (unwind_backtrace+0x0/0xf0) from (warn_slowpath_common+0x4c/0x64) (warn_slowpath_common+0x4c/0x64) from (warn_slowpath_fmt+0x30/0x40) (warn_slowpath_fmt+0x30/0x40) from (handle_irq_event_percpu+0x19c/0x1b8) (handle_irq_event_percpu+0x19c/0x1b8) from (handle_irq_event+0x28/0x38) (handle_irq_event+0x28/0x38) from (handle_level_irq+0x80/0xc4) (handle_level_irq+0x80/0xc4) from (generic_handle_irq+0x24/0x38) (generic_handle_irq+0x24/0x38) from (handle_IRQ+0x30/0x84) (handle_IRQ+0x30/0x84) from (avic_handle_irq+0x2c/0x4c) (avic_handle_irq+0x2c/0x4c) from (__irq_svc+0x40/0x60) Exception stack(0xc050bf60 to 0xc050bfa8) bf60: 00000001 00000000 003c4208 c0018e20 c050a000 c050a000 c054a4c8 c050a000 bf80: c05157a8 4117b363 80503bb4 00000000 01000000 c050bfa8 c0018e2c c000e808 bfa0: 60000013 ffffffff (__irq_svc+0x40/0x60) from (default_idle+0x1c/0x30) (default_idle+0x1c/0x30) from (cpu_idle+0x68/0xa8) (cpu_idle+0x68/0xa8) from (start_kernel+0x22c/0x26c) Signed-off-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Sascha Hauer <kernel@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mm/memory_hotplug.c: release memory resources if hotadd_new_pgdat() failsWen Congyang2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should goto error to release memory resource if hotadd_new_pgdat() failed. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki ISIMATU <isimatu.yasuaki@jp.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Len Brown <lenb@kernel.org> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | h8300/uaccess: add mising __clear_user()Geert Uytterhoeven2012-07-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the build error: include/linux/regset.h: In function 'user_regset_copyout_zero': include/linux/regset.h:289:3: error: implicit declaration of function '__clear_user' [-Werror=implicit-function-declaration] Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | h8300/uaccess: remove assignment to __gu_val in unhandled case of get_user()Geert Uytterhoeven2012-07-121-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __gu_val is const if the passed ptr is const, giving: include/linux/pagemap.h: In function 'fault_in_pages_readable': include/linux/pagemap.h:442:2: error: assignment of read-only variable '__gu_val' include/linux/pagemap.h:448:4: error: assignment of read-only variable '__gu_val' include/linux/pagemap.h: In function 'fault_in_multipages_readable': include/linux/pagemap.h:499:3: error: assignment of read-only variable '__gu_val' include/linux/pagemap.h:508:3: error: assignment of read-only variable '__gu_val' make[4]: *** [init/main.o] Error 1 As we don't care about the actual value of __gu_val in the unhandled case (it will cause a link error anyway), just remove the assignment. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | h8300/time: add missing #include <asm/irq_regs.h>Geert Uytterhoeven2012-07-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the build error: arch/h8300/kernel/time.c: In function 'h8300_timer_tick': arch/h8300/kernel/time.c:39:2: error: implicit declaration of function 'get_irq_regs' [-Werror=implicit-function-declaration] arch/h8300/kernel/time.c:39:42: error: invalid type argument of '->' (have 'int') Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | h8300/signal: fix typo "statis"Geert Uytterhoeven2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The keyword is "static", not "statis": arch/h8300/kernel/signal.c:455:8: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'void' arch/h8300/kernel/signal.c: In function 'do_notify_resume': arch/h8300/kernel/signal.c:511:3: error: implicit declaration of function 'do_signal' [-Werror=implicit-function-declaration] arch/h8300/kernel/signal.c: At top level: arch/h8300/kernel/signal.c:414:1: warning: 'handle_signal' defined but not used [-Wunused-function] Introduced in commit 7ae4e32a6514 ("h8300: switch to saved_sigmask-based sigsuspend/rt_sigsuspend") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | h8300/pgtable: add missing #include <asm-generic/pgtable.h>Geert Uytterhoeven2012-07-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the h8300 build error: kernel/sched/core.c: In function 'context_switch': kernel/sched/core.c:2061:2: error: implicit declaration of function 'arch_start_context_switch' [-Werror=implicit-function-declaration] Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | drivers/rtc/rtc-ab8500.c: ensure correct probing of the AB8500 RTC when ↵Lee Jones2012-07-121-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Device Tree is enabled Without this patch, if Device Tree is enabled the AB8500 RTC wouldn't get probed at all, as there is no reference to it from platform code. This patch ensures the driver is probed during normal DT start-up. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Lee Jones <lee.jones@linaro.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | drivers/rtc/rtc-ab8500.c: use IRQF_ONESHOT when requesting a threaded IRQLee Jones2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This driver's IRQ registration is failing because the kernel now forces IRQs to be ONESHOT if no IRQ handler is passed. Signed-off-by: Lee Jones <lee.jones@linaro.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mm, thp: abort compaction if migration page cannot be charged to memcgDavid Rientjes2012-07-121-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If page migration cannot charge the temporary page to the memcg, migrate_pages() will return -ENOMEM. This isn't considered in memory compaction however, and the loop continues to iterate over all pageblocks trying to isolate and migrate pages. If a small number of very large memcgs happen to be oom, however, these attempts will mostly be futile leading to an enormous amout of cpu consumption due to the page migration failures. This patch will short circuit and fail memory compaction if migrate_pages() returns -ENOMEM. COMPACT_PARTIAL is returned in case some migrations were successful so that the page allocator will retry. Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | c/r: prctl: less paranoid prctl_set_mm_exe_file()Konstantin Khlebnikov2012-07-121-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "no other files mapped" requirement from my previous patch (c/r: prctl: update prctl_set_mm_exe_file() after mm->num_exe_file_vmas removal) is too paranoid, it forbids operation even if there mapped one shared-anon vma. Let's check that current mm->exe_file already unmapped, in this case exe_file symlink already outdated and its changing is reasonable. Plus, this patch fixes exit code in case operation success. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Reported-by: Cyrill Gorcunov <gorcunov@openvz.org> Tested-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Kees Cook <keescook@chromium.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | ocfs2: fix NULL pointer dereference in __ocfs2_change_file_space()Luis Henriques2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As ocfs2_fallocate() will invoke __ocfs2_change_file_space() with a NULL as the first parameter (file), it may trigger a NULL pointer dereferrence due to a missing check. Addresses http://bugs.launchpad.net/bugs/1006012 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Reported-by: Bret Towe <magnade@gmail.com> Tested-by: Bret Towe <magnade@gmail.com> Cc: Sunil Mushran <sunil.mushran@oracle.com> Acked-by: Joel Becker <jlbec@evilplan.org> Acked-by: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mn10300: use "#elif defined(CONFIG_*)" instead of "#elif CONFIG_*"Geert Uytterhoeven2012-07-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the warnings: arch/mn10300/kernel/irq.c:173:7: warning: "CONFIG_MN10300_TTYSM1_TIMER9" is not defined [-Wundef] arch/mn10300/kernel/irq.c:175:7: warning: "CONFIG_MN10300_TTYSM1_TIMER3" is not defined [-Wundef] Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mn10300: mm/dma-alloc.c needs <linux/export.h>Geert Uytterhoeven2012-07-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the warnings: arch/mn10300/mm/dma-alloc.c: At top level: arch/mn10300/mm/dma-alloc.c:63:1: warning: data definition has no type or storage class [enabled by default] arch/mn10300/mm/dma-alloc.c:63:1: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' [-Wimplicit-int] arch/mn10300/mm/dma-alloc.c:63:1: warning: parameter names (without types) in function declaration [enabled by default] arch/mn10300/mm/dma-alloc.c:75:1: warning: data definition has no type or storage class [enabled by default] arch/mn10300/mm/dma-alloc.c:75:1: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' [-Wimplicit-int] arch/mn10300/mm/dma-alloc.c:75:1: warning: parameter names (without types) in function declaration [enabled by default] Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mn10300: kernel/traps.c needs <linux/export.h>Geert Uytterhoeven2012-07-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the warning: arch/mn10300/kernel/traps.c:304:1: warning: data definition has no type or storage class [enabled by default] arch/mn10300/kernel/traps.c:304:1: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL' [-Wimplicit-int] arch/mn10300/kernel/traps.c:304:1: warning: parameter names (without types) in function declaration [enabled by default] Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mn10300: kernel/internal.h needs <linux/irqreturn.h>Geert Uytterhoeven2012-07-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the nm10300 build failure: In file included from arch/mn10300/kernel/csrc-mn10300.c:14:0: arch/mn10300/kernel/internal.h:42:1: error: unknown type name 'irqreturn_t' Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mn10300: remove duplicate definition of PTRACE_O_TRACESYSGOODGeert Uytterhoeven2012-07-121-3/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the warning: include/linux/ptrace.h:66:0: warning: "PTRACE_O_TRACESYSGOOD" redefined [enabled by default] arch/mn10300/include/asm/ptrace.h:85:0: note: this is the location of the previous definition We already have it in <linux/ptrace.h>, so remove it from <asm/ptrace.h> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | mn10300: move setup_jiffies_interrupt() to cevt-mn10300.cGeert Uytterhoeven2012-07-127-23/+12Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the static inline function setup_jiffies_interrupt() from <asm/timex.h> to arch/mn10300/kernel/cevt-mn10300.c, which is its only callsite. This allows to remove the inclusion of <asm/hardirq.h> and <linux/irq.h> from <asm/timex.h> and <unit/timex.h>, fixing include hell like: include/linux/jiffies.h:260:31: warning: "CLOCK_TICK_RATE" is not defined [-Wundef] include/linux/jiffies.h:260:31: warning: "CLOCK_TICK_RATE" is not defined [-Wundef] include/linux/jiffies.h:46:42: error: division by zero in #if ... make[4]: *** [arch/mn10300/kernel/asm-offsets.s] Error 1 and (after a quick hack for the above by defining CLOCK_TICK_RATE in <linux/jiffies.h>): In file included from include/linux/notifier.h:15:0, from include/linux/memory_hotplug.h:6, from include/linux/mmzone.h:718, from include/linux/gfp.h:4, from include/linux/irq.h:20, from arch/mn10300/unit-asb2303/include/unit/timex.h:15, from arch/mn10300/include/asm/timex.h:15, from include/linux/timex.h:174, from include/linux/jiffies.h:8, from include/linux/ktime.h:25, from include/linux/timer.h:5, from include/linux/workqueue.h:8, include/linux/srcu.h:55:22: error: field 'work' has incomplete type As a consequence, we do need a few more inclusions of <asm/irq.h>, namely in arch/mn10300/unit-asb2303/smc91111.c and arch/mn10300/unit-asb2305/unit-init.c. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | drivers/rtc/rtc-spear.c: fix use-after-free in spear_rtc_remove()Devendra Naga2012-07-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `config' is freed and is then used in the rtc_device_unregister() call, causing a kernel panic. Signed-off-by: Devendra Naga <devendra.aaru@gmail.com> Reviewed-by: Viresh Kumar <viresh.linux@gmail.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | memory hotplug: fix invalid memory access caused by stale kswapd pointerJiang Liu2012-07-122-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kswapd_stop() is called to destroy the kswapd work thread when all memory of a NUMA node has been offlined. But kswapd_stop() only terminates the work thread without resetting NODE_DATA(nid)->kswapd to NULL. The stale pointer will prevent kswapd_run() from creating a new work thread when adding memory to the memory-less NUMA node again. Eventually the stale pointer may cause invalid memory access. An example stack dump as below. It's reproduced with 2.6.32, but latest kernel has the same issue. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81051a94>] exit_creds+0x12/0x78 PGD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/system/memory/memory391/state CPU 11 Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq microcode fuse loop dm_mod tpm_tis rtc_cmos i2c_i801 rtc_core tpm serio_raw pcspkr sg tpm_bios igb i2c_core iTCO_wdt rtc_lib mptctl iTCO_vendor_support button dca bnx2 usbhid hid uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif edd ext3 mbcache jbd fan ide_pci_generic ide_core ata_generic ata_piix libata thermal processor thermal_sys hwmon mptsas mptscsih mptbase scsi_transport_sas scsi_mod Pid: 7949, comm: sh Not tainted 2.6.32.12-qiuxishi-5-default #92 Tecal RH2285 RIP: 0010:exit_creds+0x12/0x78 RSP: 0018:ffff8806044f1d78 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff880604f22140 RCX: 0000000000019502 RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000 RBP: ffff880604f22150 R08: 0000000000000000 R09: ffffffff81a4dc10 R10: 00000000000032a0 R11: ffff880006202500 R12: 0000000000000000 R13: 0000000000c40000 R14: 0000000000008000 R15: 0000000000000001 FS: 00007fbc03d066f0(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 000000060f029000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process sh (pid: 7949, threadinfo ffff8806044f0000, task ffff880603d7c600) Stack: ffff880604f22140 ffffffff8103aac5 ffff880604f22140 ffffffff8104d21e ffff880006202500 0000000000008000 0000000000c38000 ffffffff810bd5b1 0000000000000000 ffff880603d7c600 00000000ffffdd29 0000000000000003 Call Trace: __put_task_struct+0x5d/0x97 kthread_stop+0x50/0x58 offline_pages+0x324/0x3da memory_block_change_state+0x179/0x1db store_mem_state+0x9e/0xbb sysfs_write_file+0xd0/0x107 vfs_write+0xad/0x169 sys_write+0x45/0x6e system_call_fastpath+0x16/0x1b Code: ff 4d 00 0f 94 c0 84 c0 74 08 48 89 ef e8 1f fd ff ff 5b 5d 31 c0 41 5c c3 53 48 8b 87 20 06 00 00 48 89 fb 48 8b bf 18 06 00 00 <8b> 00 48 c7 83 18 06 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0 RIP exit_creds+0x12/0x78 RSP <ffff8806044f1d78> CR2: 0000000000000000 [akpm@linux-foundation.org: add pglist_data.kswapd locking comments] Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge branch 'stable' of ↵Linus Torvalds2012-07-111-2/+7
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile Pull arch/tile fix from Chris Metcalf: "This is a single change to fix backtracing in big-endian mode." * 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: arch/tile: big-endian: properly bswap instruction bundles when backtracing
| * | | arch/tile: big-endian: properly bswap instruction bundles when backtracingChris Metcalf2012-06-181-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instruction bundles are always little-endian, even when running in big-endian mode. I missed this internal bug fix when cherry-picking the big-endian code to return to the community. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
* | | | Merge tag 'scsi-fixes' of ↵Linus Torvalds2012-07-117-17/+25
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of three fixes for data corruption (libsas task file), oops causing (NULL in scsi_cmd_to_driver) and driver failure (bnx2i). The oops caused by the NULL in scsi_cmd_to_driver() manifests in scsi_eh_send_cmd() and has been seen by several people now. Signed-off-by: James Bottomley <JBottomley@Parallels.com>" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] bnx2i: Removed the reference to the netdev->base_addr [SCSI] libsas: fix taskfile corruption in sas_ata_qc_fill_rtf [SCSI] Fix NULL dereferences in scsi_cmd_to_driver