summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-3.5' of ↵Linus Torvalds2012-05-2318-438/+687
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "cgroup file type addition / removal is updated so that file types are added and removed instead of individual files so that dynamic file type addition / removal can be implemented by cgroup and used by controllers. blkio controller changes which will come through block tree are dependent on this. Other changes include res_counter cleanup and disallowing kthread / PF_THREAD_BOUND threads to be attached to non-root cgroups. There's a reported bug with the file type addition / removal handling which can lead to oops on cgroup umount. The issue is being looked into. It shouldn't cause problems for most setups and isn't a security concern." Fix up trivial conflict in Documentation/feature-removal-schedule.txt * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) res_counter: Account max_usage when calling res_counter_charge_nofail() res_counter: Merge res_counter_charge and res_counter_charge_nofail cgroups: disallow attaching kthreadd or PF_THREAD_BOUND threads cgroup: remove cgroup_subsys->populate() cgroup: get rid of populate for memcg cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcg cgroup: make css->refcnt clearing on cgroup removal optional cgroup: use negative bias on css->refcnt to block css_tryget() cgroup: implement cgroup_rm_cftypes() cgroup: introduce struct cfent cgroup: relocate __d_cgrp() and __d_cft() cgroup: remove cgroup_add_file[s]() cgroup: convert memcg controller to the new cftype interface memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAP cgroup: convert all non-memcg controllers to the new cftype interface cgroup: relocate cftype and cgroup_subsys definitions in controllers cgroup: merge cft_release_agent cftype array into the base files array cgroup: implement cgroup_add_cftypes() and friends cgroup: build list of all cgroups under a given cgroupfs_root cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir() ...
| * res_counter: Account max_usage when calling res_counter_charge_nofail()Frederic Weisbecker2012-04-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Updating max_usage is something one would expect when we reach a new maximum usage value even when we do this by forcing through the limit with res_counter_charge_nofail(). (Whether we want to account failcnt when we force through the limit is another debate). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Glauber Costa <glommer@parallels.com> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Li Zefan <lizefan@huawei.com>
| * res_counter: Merge res_counter_charge and res_counter_charge_nofailFrederic Weisbecker2012-04-273-42/+37Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These two functions do almost the same thing and duplicate some code. Merge their implementation into a single common function. res_counter_charge_locked() takes one more parameter but it doesn't seem to be used outside res_counter.c yet anyway. There is no (intended) change in the behaviour. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Glauber Costa <glommer@parallels.com> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Li Zefan <lizefan@huawei.com>
| * cgroups: disallow attaching kthreadd or PF_THREAD_BOUND threadsMike Galbraith2012-04-231-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allowing kthreadd to be moved to a non-root group makes no sense, it being a global resource, and needlessly leads unsuspecting users toward trouble. 1. An RT workqueue worker thread spawned in a task group with no rt_runtime allocated is not schedulable. Simple user error, but harmful to the box. 2. A worker thread which acquires PF_THREAD_BOUND can never leave a cpuset, rendering the cpuset immortal. Save the user some unexpected trouble, just say no. Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * cgroup: remove cgroup_subsys->populate()Tejun Heo2012-04-112-4/+0Star
| | | | | | | | | | | | | | | | With memcg converted, cgroup_subsys->populate() doesn't have any user left. Remove it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
| * cgroup: get rid of populate for memcgGlauber Costa2012-04-101-10/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The last man standing justifying the need for populate() is the sock memcg initialization functions. Now that we are able to pass a struct mem_cgroup instead of a struct cgroup to the socket initialization, there is nothing that stops us from initializing everything in create(). Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org> CC: Li Zefan <lizefan@huawei.com> CC: Johannes Weiner <hannes@cmpxchg.org> CC: Michal Hocko <mhocko@suse.cz>
| * cgroup: pass struct mem_cgroup instead of struct cgroup to socket memcgGlauber Costa2012-04-105-32/+24Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only reason cgroup was used, was to be consistent with the populate() interface. Now that we're getting rid of it, not only we no longer need it, but we also *can't* call it this way. Since we will no longer rely on populate(), this will be called from create(). During create, the association between struct mem_cgroup and struct cgroup does not yet exist, since cgroup internals hasn't yet initialized its bookkeeping. This means we would not be able to draw the memcg pointer from the cgroup pointer in these functions, which is highly undesirable. Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Tejun Heo <tj@kernel.org> CC: Li Zefan <lizefan@huawei.com> CC: Johannes Weiner <hannes@cmpxchg.org> CC: Michal Hocko <mhocko@suse.cz>
| * cgroup: make css->refcnt clearing on cgroup removal optionalTejun Heo2012-04-013-9/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, cgroup removal tries to drain all css references. If there are active css references, the removal logic waits and retries ->pre_detroy() until either all refs drop to zero or removal is cancelled. This semantics is unusual and adds non-trivial complexity to cgroup core and IMHO is fundamentally misguided in that it couples internal implementation details (references to internal data structure) with externally visible operation (rmdir). To userland, this is a behavior peculiarity which is unnecessary and difficult to expect (css refs is otherwise invisible from userland), and, to policy implementations, this is an unnecessary restriction (e.g. blkcg wants to hold css refs for caching purposes but can't as that becomes visible as rmdir hang). Unfortunately, memcg currently depends on ->pre_destroy() retrials and cgroup removal vetoing and can't be immmediately switched to the new behavior. This patch introduces the new behavior of not waiting for css refs to drain and maintains the old behavior for subsystems which have __DEPRECATED_clear_css_refs set. Once, memcg is updated, we can drop the code paths for the old behavior as proposed in the following patch. Note that the following patch is incorrect in that dput work item is in cgroup and may lose some of dputs when multiples css's are released back-to-back, and __css_put() triggers check_for_release() when refcnt reaches 0 instead of 1; however, it shows what part can be removed. http://thread.gmane.org/gmane.linux.kernel.containers/22559/focus=75251 Note that, in not-too-distant future, cgroup core will start emitting warning messages for subsys which require the old behavior, so please get moving. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
| * cgroup: use negative bias on css->refcnt to block css_tryget()Tejun Heo2012-04-012-56/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a cgroup is about to be removed, cgroup_clear_css_refs() is called to check and ensure that there are no active css references. This is currently achieved by dropping the refcnt to zero iff it has only the base ref. If all css refs could be dropped to zero, ref clearing is successful and CSS_REMOVED is set on all css. If not, the base ref is restored. While css ref is zero w/o CSS_REMOVED set, any css_tryget() attempt on it busy loops so that they are atomic w.r.t. the whole css ref clearing. This does work but dropping and re-instating the base ref is somewhat hairy and makes it difficult to add more logic to the put path as there are two of them - the regular css_put() and the reversible base ref clearing. This patch updates css ref clearing such that blocking new css_tryget() and putting the base ref are separate operations. CSS_DEACT_BIAS, defined as INT_MIN, is added to css->refcnt and css_tryget() busy loops while refcnt is negative. After all css refs are deactivated, if they were all one, ref clearing succeeded and CSS_REMOVED is set and the base ref is put using the regular css_put(); otherwise, CSS_DEACT_BIAS is subtracted from the refcnts and the original postive values are restored. css_refcnt() accessor which always returns the unbiased positive reference counts is added and used to simplify refcnt usages. While at it, relocate and reformat comments in cgroup_has_css_refs(). This separates css->refcnt deactivation and putting the base ref, which enables the next patch to make ref clearing optional. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: implement cgroup_rm_cftypes()Tejun Heo2012-04-012-10/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | Implement cgroup_rm_cftypes() which removes an array of cftypes from a subsystem. It can be called whether the target subsys is attached or not. cgroup core will remove the specified file from all existing cgroups. This will be used to improve sub-subsys modularity and will be helpful for unified hierarchy. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: introduce struct cfentTejun Heo2012-04-012-36/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds cfent (cgroup file entry) which is the association between a cgroup and a file. This is in-cgroup representation of files under a cgroup directory. This simplifies walking walking cgroup files and thus cgroup_clear_directory(), which is now implemented in two parts - cgroup_rm_file() and a loop around it. cgroup_rm_file() will be used to implement cftype removal and cfent is scheduled to serve cgroup specific per-file data (e.g. for sysfs-like "sever" semantics). v2: - cfe was freed from cgroup_rm_file() which led to use-after-free if the file had openers at the time of removal. Moved to cgroup_diput(). - cgroup_clear_directory() triggered WARN_ON_ONCE() if d_subdirs wasn't empty after removing all files. This triggered spuriously if some files were open during directory clearing. Removed. v3: - In cgroup_diput(), WARN_ONCE(!list_empty(&cfe->node)) could be spuriously triggered for root cgroups because they don't go through cgroup_clear_directory() on unmount. Don't trigger WARN for root cgroups. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Glauber Costa <glommer@parallels.com>
| * cgroup: relocate __d_cgrp() and __d_cft()Tejun Heo2012-04-011-10/+10
| | | | | | | | | | | | | | Move the two macros upwards as they'll be used earlier in the file. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: remove cgroup_add_file[s]()Tejun Heo2012-04-012-47/+20Star
| | | | | | | | | | | | | | | | | | | | No controller is using cgroup_add_files[s](). Unexport them, and convert cgroup_add_files() to handle NULL entry terminated array instead of taking count explicitly and continue creation on failure for internal use. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: convert memcg controller to the new cftype interfaceTejun Heo2012-04-012-15/+13Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert memcg to use the new cftype based interface. kmem support abuses ->populate() for mem_cgroup_sockets_init() so it can't be removed at the moment. tcp_memcontrol is updated so that tcp_files[] is registered via a __initcall. This change also allows removing the forward declaration of tcp_files[]. Removed. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Glauber Costa <glommer@parallels.com> Cc: Hugh Dickins <hughd@google.com> Cc: Greg Thelen <gthelen@google.com>
| * memcg: always create memsw files if CONFIG_CGROUP_MEM_RES_CTLR_SWAPTejun Heo2012-04-011-34/+31Star
| | | | | | | | | | | | | | | | | | | | | | | | | | Instead of conditioning creation of memsw files on do_swap_account, always create the files if compiled-in and fail read/write attempts with -EOPNOTSUPP if !do_swap_account. This is suggested by KAMEZAWA to simplify memcg file creation so that it can use cgroup->subsys_cftypes. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: convert all non-memcg controllers to the new cftype interfaceTejun Heo2012-04-018-75/+28Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert debug, freezer, cpuset, cpu_cgroup, cpuacct, net_prio, blkio, net_cls and device controllers to use the new cftype based interface. Termination entry is added to cftype arrays and populate callbacks are replaced with cgroup_subsys->base_cftypes initializations. This is functionally identical transformation. There shouldn't be any visible behavior change. memcg is rather special and will be converted separately. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Paul Menage <paul@paulmenage.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Vivek Goyal <vgoyal@redhat.com>
| * cgroup: relocate cftype and cgroup_subsys definitions in controllersTejun Heo2012-04-014-83/+65Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blk-cgroup, netprio_cgroup, cls_cgroup and tcp_memcontrol unnecessarily define cftype array and cgroup_subsys structures at the top of the file, which is unconventional and necessiates forward declaration of methods. This patch relocates those below the definitions of the methods and removes the forward declarations. Note that forward declaration of tcp_files[] is added in tcp_memcontrol.c for tcp_init_cgroup(). This will be removed soon by another patch. This patch doesn't introduce any functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: merge cft_release_agent cftype array into the base files arrayTejun Heo2012-04-011-12/+7Star
| | | | | | | | | | | | | | | | Now that cftype can express whether a file should only be on root, cft_release_agent can be merged into the base files cftypes array. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: implement cgroup_add_cftypes() and friendsTejun Heo2012-04-012-3/+162
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, cgroup directories are populated by subsys->populate() callback explicitly creating files on each cgroup creation. This level of flexibility isn't needed or desirable. It provides largely unused flexibility which call for abuses while severely limiting what the core layer can do through the lack of structure and conventions. Per each cgroup file type, the only distinction that cgroup users is making is whether a cgroup is root or not, which can easily be expressed with flags. This patch introduces cgroup_add_cftypes(). These deal with cftypes instead of individual files - controllers indicate that certain types of files exist for certain subsystem. Newly added CFTYPE_*_ON_ROOT flags indicate whether a cftype should be excluded or created only on the root cgroup. cgroup_add_cftypes() can be called any time whether the target subsystem is currently attached or not. cgroup core will create files on the existing cgroups as necessary. Also, cgroup_subsys->base_cftypes is added to ease registration of the base files for the subsystem. If non-NULL on subsys init, the cftypes pointed to by ->base_cftypes are automatically registered on subsys init / load. Further patches will convert the existing users and remove the file based interface. Note that this interface allows dynamic addition of files to an active controller. This will be used for sub-controller modularity and unified hierarchy in the longer term. This patch implements the new mechanism but doesn't apply it to any user. v2: replaced DECLARE_CGROUP_CFTYPES[_COND]() with cgroup_subsys->base_cftypes, which works better for cgroup_subsys which is loaded as module. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: build list of all cgroups under a given cgroupfs_rootTejun Heo2012-04-012-0/+12
| | | | | | | | | | | | | | | | | | Build a list of all cgroups anchored at cgroupfs_root->allcg_list and going through cgroup->allcg_node. The list is protected by cgroup_mutex and will be used to improve cgroup file handling. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: move cgroup_clear_directory() call out of cgroup_populate_dir()Tejun Heo2012-04-011-4/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | cgroup_populate_dir() currently clears all files and then repopulate the directory; however, the clearing part is only useful when it's called from cgroup_remount(). Relocate the invocation to cgroup_remount(). This is to prepare for further cgroup file handling updates. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com>
| * cgroup: deprecate remount option changesTejun Heo2012-04-012-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch marks the following features for deprecation. * Rebinding subsys by remount: Never reached useful state - only works on empty hierarchies. * release_agent update by remount: release_agent itself will be replaced with conventional fsnotify notification. v2: Lennart pointed out that "name=" is necessary for mounts w/o any controller attached. Drop "name=" deprecation. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Lennart Poettering <mzxreary@0pointer.de>
* | Merge branch 'for-3.5' of ↵Linus Torvalds2012-05-2325-124/+69Star
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "Contains Alex Shi's three patches to remove percpu_xxx() which overlap with this_cpu_xxx(). There shouldn't be any functional change." * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: remove percpu_xxx() functions x86: replace percpu_xxx funcs with this_cpu_xxx net: replace percpu_xxx funcs with this_cpu_xxx or __this_cpu_xxx
| * | percpu: remove percpu_xxx() functionsAlex Shi2012-05-142-64/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove percpu_xxx serial functions, all of them were replaced by this_cpu_xxx or __this_cpu_xxx serial functions Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | x86: replace percpu_xxx funcs with this_cpu_xxxAlex Shi2012-05-1422-52/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since percpu_xxx() serial functions are duplicated with this_cpu_xxx(). Removing percpu_xxx() definition and replacing them by this_cpu_xxx() in code. There is no function change in this patch, just preparation for later percpu_xxx serial function removing. On x86 machine the this_cpu_xxx() serial functions are same as __this_cpu_xxx() without no unnecessary premmpt enable/disable. Thanks for Stephen Rothwell, he found and fixed a i386 build error in the patch. Also thanks for Andrew Morton, he kept updating the patchset in Linus' tree. Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Christoph Lameter <cl@gentwo.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | net: replace percpu_xxx funcs with this_cpu_xxx or __this_cpu_xxxAlex Shi2012-05-142-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace them for further code clean up. And in preempt safe scenario, __this_cpu_xxx funcs may has a bit better performance since __this_cpu_xxx has no redundant preempt_enable/preempt_disable on some architectures. Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: Patrick McHardy <kaber@trash.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2012-05-235-306/+38Star
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull workqueue changes from Tejun Heo: "Nothing exciting. Most are updates to debug stuff and related fixes. Two not-too-critical bugs are fixed - WARN_ON() triggering spurious during cpu offlining and unlikely lockdep related oops." * 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: lockdep: fix oops in processing workqueue workqueue: skip nr_running sanity check in worker_enter_idle() if trustee is active workqueue: Catch more locking problems with flush_work() workqueue: change BUG_ON() to WARN_ON() trace: Remove unused workqueue tracer
| * | | lockdep: fix oops in processing workqueuePeter Zijlstra2012-05-153-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Under memory load, on x86_64, with lockdep enabled, the workqueue's process_one_work() has been seen to oops in __lock_acquire(), barfing on a 0xffffffff00000000 pointer in the lockdep_map's class_cache[]. Because it's permissible to free a work_struct from its callout function, the map used is an onstack copy of the map given in the work_struct: and that copy is made without any locking. Surprisingly, gcc (4.5.1 in Hugh's case) uses "rep movsl" rather than "rep movsq" for that structure copy: which might race with a workqueue user's wait_on_work() doing lock_map_acquire() on the source of the copy, putting a pointer into the class_cache[], but only in time for the top half of that pointer to be copied to the destination map. Boom when process_one_work() subsequently does lock_map_acquire() on its onstack copy of the lockdep_map. Fix this, and a similar instance in call_timer_fn(), with a lockdep_copy_map() function which additionally NULLs the class_cache[]. Note: this oops was actually seen on 3.4-next, where flush_work() newly does the racing lock_map_acquire(); but Tejun points out that 3.4 and earlier are already vulnerable to the same through wait_on_work(). * Patch orginally from Peter. Hugh modified it a bit and wrote the description. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reported-by: Hugh Dickins <hughd@google.com> LKML-Reference: <alpine.LSU.2.00.1205070951170.1544@eggly.anvils> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | workqueue: skip nr_running sanity check in worker_enter_idle() if trustee is ↵Tejun Heo2012-05-151-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | active worker_enter_idle() has WARN_ON_ONCE() which triggers if nr_running isn't zero when every worker is idle. This can trigger spuriously while a cpu is going down due to the way trustee sets %WORKER_ROGUE and zaps nr_running. It first sets %WORKER_ROGUE on all workers without updating nr_running, releases gcwq->lock, schedules, regrabs gcwq->lock and then zaps nr_running. If the last running worker enters idle inbetween, it would see stale nr_running which hasn't been zapped yet and trigger the WARN_ON_ONCE(). Fix it by performing the sanity check iff the trustee is idle. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: stable@vger.kernel.org
| * | | workqueue: Catch more locking problems with flush_work()Stephen Boyd2012-04-231-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a workqueue is flushed with flush_work() lockdep checking can be circumvented. For example: static DEFINE_MUTEX(mutex); static void my_work(struct work_struct *w) { mutex_lock(&mutex); mutex_unlock(&mutex); } static DECLARE_WORK(work, my_work); static int __init start_test_module(void) { schedule_work(&work); return 0; } module_init(start_test_module); static void __exit stop_test_module(void) { mutex_lock(&mutex); flush_work(&work); mutex_unlock(&mutex); } module_exit(stop_test_module); would not always print a warning when flush_work() was called. In this trivial example nothing could go wrong since we are guaranteed module_init() and module_exit() don't run concurrently, but if the work item is schedule asynchronously we could have a scenario where the work item is running just at the time flush_work() is called resulting in a classic ABBA locking problem. Add a lockdep hint by acquiring and releasing the work item lockdep_map in flush_work() so that we always catch this potential deadlock scenario. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | workqueue: change BUG_ON() to WARN_ON()Dan Carpenter2012-04-161-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This BUG_ON() can be triggered if you call schedule_work() before calling INIT_WORK(). It is a bug definitely, but it's nicer to just print a stack trace and return. Reported-by: Matt Renzelmann <mjr@cs.wisc.edu> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | trace: Remove unused workqueue tracerStephen Boyd2012-04-112-301/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This tracer was temporarily removed in 6416669 (workqueue: temporarily remove workqueue tracing, 2010-06-29) but never reinstated after concurrency managed workqueues were completed. For almost two years it hasn't been compilable so it seems nobody is using it. Delete it. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
* | | | Merge tag 'staging-3.5-rc1' of ↵Linus Torvalds2012-05-23611-25614/+27821
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging tree changes from Greg Kroah-Hartman: "Here is the big staging tree pull request for the 3.5-rc1 merge window. Loads of changes here, and we just narrowly added more lines than we added: 622 files changed, 28356 insertions(+), 26059 deletions(-) But, good news is that there is a number of subsystems that moved out of the staging tree, to their respective "real" portions of the kernel. Code that moved out was: - iio core code - mei driver - vme core and bridge drivers There was one broken network driver that moved into staging as a step before it is removed from the tree (pc300), and there was a few new drivers added to the tree: - new iio drivers - gdm72xx wimax USB driver - ipack subsystem and 2 drivers All of the movements around have acks from the various subsystem maintainers, and all of this has been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fixed up various trivial conflicts, along with a non-trivial one found in -next and pointed out by Olof Johanssen: a clean - but incorrect - merge of the arch/arm/boot/dts/at91sam9g20.dtsi file. Fix up manually as per Stephen Rothwell. * tag 'staging-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (536 commits) Staging: bcm: Remove two unused variables from Adapter.h Staging: bcm: Removes the volatile type definition from Adapter.h Staging: bcm: Rename all "INT" to "int" in Adapter.h Staging: bcm: Fix warning: __packed vs. __attribute__((packed)) in Adapter.h Staging: bcm: Correctly format all comments in Adapter.h Staging: bcm: Fix all whitespace issues in Adapter.h Staging: bcm: Properly format braces in Adapter.h Staging: ipack/bridges/tpci200: remove unneeded casts Staging: ipack/bridges/tpci200: remove TPCI200_SHORTNAME constant Staging: ipack: remove board_name and bus_name fields from struct ipack_device Staging: ipack: improve the register of a bus and a device in the bus. staging: comedi: cleanup all the comedi_driver 'detach' functions staging: comedi: remove all 'default N' in Kconfig staging: line6/config.h: Delete unused header staging: gdm72xx depends on NET staging: gdm72xx: Set up parent link in sysfs for gdm72xx devices staging: drm/omap: initial dmabuf/prime import support staging: drm/omap: dmabuf/prime mmap support pstore/ram: Add ECC support pstore/ram: Switch to persistent_ram routines ...
| * | | | Staging: bcm: Remove two unused variables from Adapter.hKevin McKinney2012-05-191-2/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes two unused variables that are defined in the _MINI_ADAPTER struct. Signed-off-by: Kevin McKinney <klmckinney1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: bcm: Removes the volatile type definition from Adapter.hKevin McKinney2012-05-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the following warning: "Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt". There were two variables defined in this manner. Signed-off-by: Kevin McKinney <klmckinney1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: bcm: Rename all "INT" to "int" in Adapter.hKevin McKinney2012-05-191-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch renames uppercase "INT" with lowercase "int". Signed-off-by: Kevin McKinney <klmckinney1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: bcm: Fix warning: __packed vs. __attribute__((packed)) in Adapter.hKevin McKinney2012-05-191-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following warning reported by checkpatch.pl: "WARNING: __packed is preferred over __attribute__((packed))". Signed-off-by: Kevin McKinney <klmckinney1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: bcm: Correctly format all comments in Adapter.hKevin McKinney2012-05-191-62/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch correctly formats all comments as reported by checkpatch.pl. Signed-off-by: Kevin McKinney <klmckinney1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: bcm: Fix all whitespace issues in Adapter.hKevin McKinney2012-05-191-374/+342Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch resolves all whitespace issues as reported by checkpatch.pl. Signed-off-by: Kevin McKinney <klmckinney1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: bcm: Properly format braces in Adapter.hKevin McKinney2012-05-191-52/+26Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch cuddles braces as reported by checkpatch.pl. Signed-off-by: Kevin McKinney <klmckinney1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: ipack/bridges/tpci200: remove unneeded castsSamuel Iglesias Gonsalvez2012-05-191-20/+16Star
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: ipack/bridges/tpci200: remove TPCI200_SHORTNAME constantSamuel Iglesias Gonsalvez2012-05-192-39/+33Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed TPCI200_SHORTNAME. For the pr_* the name of the module is already included due to pr_fmt declaration. In other cases, KBUILD_MODNAME is used instead. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: ipack: remove board_name and bus_name fields from struct ipack_deviceSamuel Iglesias Gonsalvez2012-05-193-24/+8Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed board_name and bus_name fields from struct ipack_device that are completely useless. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | Staging: ipack: improve the register of a bus and a device in the bus.Samuel Iglesias Gonsalvez2012-05-195-117/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It adds and removes some fields in the struct ipack_device and ipack_bus_device to make it cleaner. The API has change to group all the operations on these structures inside of the ipack driver. Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | staging: comedi: cleanup all the comedi_driver 'detach' functionsH Hartley Sweeten2012-05-19109-993/+236Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. Change the return type from int to void All the detach functions, except for the comedi usb drivers, simply return success (0). Plus, the return code is never checked in the comedi core. The comedi usb drivers do return error codes but the conditions can never happen. The first check is: if (!dev) return -EFAULT; This checks that the passed comedi_device pointer is valid. The detach function itself is called using this pointer so it MUST always be valid or there is a bug in the core: if (dev->driver) dev->driver->detach(dev); And the second check: usb = dev->private; if (!usb) return -EFAULT; The dev->private pointer is setup in the attach function to point to the probed usb device. This value could be NULL if the attach fails. But, since the comedi core is going to unload the driver anyway and does not check for errors there is no gain by returning one. After removing these checks from the comedi usb drivers the detach functions required a bit of cleanup. 2. Remove all the printk noise in the detach functions All of the printk output is really just noise. The user did a rmmod to unload the driver, we really don't need to tell them about it. Also, some of the messages are output using: dev_dbg(dev->hw_dev, ... or dev_info(dev->hw_dev, ... Unfortunately the hw_dev value is only used by drivers that are doing DMA. For most drivers this variable is going to be NULL so the output is not going to work as expected. 3. Refactor a couple static 'free_resource' functions into the detach functions. The 'free_resource' function is only being called by the detach and it makes more sense to just absorb the code. 4. Remove a couple unnecessary braces for single statements. 5. Remove unnecessary comments. Most of the comedi drivers appear to be based on the comedi skel driver and have the comments from that driver included. These comments make sense in the skel driver for reference but they don't need to be in any of the actual drivers. 6. Remove all the extra whitespace. It's not needed to make the functions any more readable. 7. Remove the now unused 'attached_successfully' variable in the cb_pcimdda driver. This variable was only used to conditionally output some driver noise during the detach. Since all the printk's have been removed this variable is no longer necessary. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Mori Hess <fmhess@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | staging: comedi: remove all 'default N' in KconfigH Hartley Sweeten2012-05-191-135/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove all the 'default N' lines in the comedi Kconfig. They should all be 'default n' but that is the default anyway. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Mori Hess <fmhess@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | staging: line6/config.h: Delete unused headerJohannes Thumshirn2012-05-191-43/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Delete unused header file drivers/staging/line6/config.h Signed-off-by: Johannes Thumshirn <morbidrsa@googlemail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | staging: gdm72xx depends on NETRandy Dunlap2012-05-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gdm72xx uses networking interfaces, so it should depend at least on NET (maybe on WIMAX?). ERROR: "sock_release" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "netif_carrier_on" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "netif_carrier_off" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "skb_realloc_headroom" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "netif_rx" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "netlink_kernel_create" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "netif_rx_ni" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "dev_alloc_skb" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "free_netdev" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "register_netdev" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "skb_push" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "dev_get_by_index" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "skb_pull" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "init_net" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "__alloc_skb" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "netlink_broadcast" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "kfree_skb" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "alloc_netdev_mqs" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "eth_type_trans" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "ether_setup" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "unregister_netdev" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "__netif_schedule" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "skb_put" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "sock_wfree" [drivers/staging/gdm72xx/gdmwm.ko] undefined! ERROR: "__nlmsg_put" [drivers/staging/gdm72xx/gdmwm.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | staging: gdm72xx: Set up parent link in sysfs for gdm72xx devicesPaul Stewart2012-05-174-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch uses SET_NETDEV_DEV to set up a 'device' parent link in sysfs (e.g. /sys/class/net/wm0/device) for a gdm72xx device. Signed-off-by: Paul Stewart <pstew@chromium.org> Signed-off-by: Ben Chan <benchan@chromium.org> Cc: Sage Ahn <syahn@gctsemi.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | | staging: drm/omap: initial dmabuf/prime import supportRob Clark2012-05-173-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support to re-import omapdrm's own buffers. Importing buffers allocated by other drivers can be added later, but for now is not needed (we don't yet have any other exportering drivers to test with). Signed-off-by: Rob Clark <rob@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>