summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* fsnotify: remove pointless NULL initializersJan Kara2014-01-222-4/+0Star
| | | | | | | | | | | | | We usually rely on the fact that struct members not specified in the initializer are set to NULL. So do that with fsnotify function pointers as well. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fsnotify: remove .should_send_event callbackJan Kara2014-01-224-48/+21Star
| | | | | | | | | | | | | After removing event structure creation from the generic layer there is no reason for separate .should_send_event and .handle_event callbacks. So just remove the first one. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fsnotify: do not share events between notification groupsJan Kara2014-01-2210-611/+279Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently fsnotify framework creates one event structure for each notification event and links this event into all interested notification groups. This is done so that we save memory when several notification groups are interested in the event. However the need for event structure shared between inotify & fanotify bloats the event structure so the result is often higher memory consumption. Another problem is that fsnotify framework keeps path references with outstanding events so that fanotify can return open file descriptors with its events. This has the undesirable effect that filesystem cannot be unmounted while there are outstanding events - a regression for inotify compared to a situation before it was converted to fsnotify framework. For fanotify this problem is hard to avoid and users of fanotify should kind of expect this behavior when they ask for file descriptors from notified files. This patch changes fsnotify and its users to create separate event structure for each group. This allows for much simpler code (~400 lines removed by this patch) and also smaller event structures. For example on 64-bit system original struct fsnotify_event consumes 120 bytes, plus additional space for file name, additional 24 bytes for second and each subsequent group linking the event, and additional 32 bytes for each inotify group for private data. After the conversion inotify event consumes 48 bytes plus space for file name which is considerably less memory unless file names are long and there are several groups interested in the events (both of which are uncommon). Fanotify event fits in 56 bytes after the conversion (fanotify doesn't care about file names so its events don't have to have it allocated). A win unless there are four or more fanotify groups interested in the event. The conversion also solves the problem with unmount when only inotify is used as we don't have to grab path references for inotify events. [hughd@google.com: fanotify: fix corruption preventing startup] Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* inotify: provide function for name length roundingJan Kara2014-01-221-20/+21
| | | | | | | | | | | | Rounding of name length when passing it to userspace was done in several places. Provide a function to do it and use it in all places. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Eric Paris <eparis@parisplace.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'driver-core-3.14-rc1' of ↵Linus Torvalds2014-01-2117-2760/+3114
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core / sysfs patches from Greg KH: "Here's the big driver core and sysfs patch set for 3.14-rc1. There's a lot of work here moving sysfs logic out into a "kernfs" to allow other subsystems to also have a virtual filesystem with the same attributes of sysfs (handle device disconnect, dynamic creation / removal as needed / unneeded, etc) This is primarily being done for the cgroups filesystem, but the goal is to also move debugfs to it when it is ready, solving all of the known issues in that filesystem as well. The code isn't completed yet, but all should be stable now (there is a big section that was reverted due to problems found when testing) There's also some other smaller fixes, and a driver core addition that allows for a "collection" of objects, that the DRM people will be using soon (it's in this tree to make merges after -rc1 easier) All of this has been in linux-next with no reported issues" * tag 'driver-core-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (113 commits) kernfs: associate a new kernfs_node with its parent on creation kernfs: add struct dentry declaration in kernfs.h kernfs: fix get_active failure handling in kernfs_seq_*() Revert "kernfs: fix get_active failure handling in kernfs_seq_*()" Revert "kernfs: replace kernfs_node->u.completion with kernfs_root->deactivate_waitq" Revert "kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep()" Revert "kernfs: remove KERNFS_REMOVED" Revert "kernfs: restructure removal path to fix possible premature return" Revert "kernfs: invoke kernfs_unmap_bin_file() directly from __kernfs_remove()" Revert "kernfs: remove kernfs_addrm_cxt" Revert "kernfs: make kernfs_get_active() block if the node is deactivated but not removed" Revert "kernfs: implement kernfs_{de|re}activate[_self]()" Revert "kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers" Revert "pci: use device_remove_file_self() instead of device_schedule_callback()" Revert "scsi: use device_remove_file_self() instead of device_schedule_callback()" Revert "s390: use device_remove_file_self() instead of device_schedule_callback()" Revert "sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()" Revert "kernfs: remove unnecessary NULL check in __kernfs_remove()" kernfs: remove unnecessary NULL check in __kernfs_remove() drivers/base: provide an infrastructure for componentised subsystems ...
| * kernfs: associate a new kernfs_node with its parent on creationTejun Heo2014-01-174-24/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once created, a kernfs_node is always destroyed by kernfs_put(). Since ba7443bc656e ("sysfs, kernfs: implement kernfs_create/destroy_root()"), kernfs_put() depends on kernfs_root() to locate the ino_ida. kernfs_root() in turn depends on kernfs_node->parent being set for !dir nodes. This means that kernfs_put() of a !dir node requires its ->parent to be initialized. This leads to oops when a newly created !dir node is destroyed without going through kernfs_add_one() or after failing kernfs_add_one() before ->parent is set. kernfs_root() invoked from kernfs_put() will try to dereference NULL parent. Fix it by moving parent association to kernfs_new_node() from kernfs_add_one(). kernfs_new_node() now takes @parent instead of @root and determines the root from the parent and also sets the new node's parent properly. @parent parameter is removed from kernfs_add_one(). As there's no parent when creating the root node, __kernfs_new_node() which takes @root as before and doesn't set the parent is used in that case. This ensures that a kernfs_node in any stage in its life has its parent associated and thus can be put. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: fix get_active failure handling in kernfs_seq_*()Tejun Heo2014-01-141-7/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When kernfs_seq_start() fails to obtain an active reference, it returns ERR_PTR(-ENODEV). kernfs_seq_stop() is then invoked with the error pointer value; however, it still proceeds to invoke kernfs_put_active() on the node leading to unbalanced put. If kernfs_seq_stop() is called even after active ref failure, it should skip invocation of @ops->seq_stop() and put_active. Unfortunately, this is a bit complicated because active ref failure isn't the only thing which may fail with ERR_PTR(-ENODEV). @ops->seq_start/next() may also fail with the error value and kernfs_seq_stop() doesn't have a way to tell apart those failures. Work it around by factoring out the active part of kernfs_seq_stop() into kernfs_seq_stop_active() and invoking it directly if @ops->seq_start/next() fail with ERR_PTR(-ENODEV) and updating kernfs_seq_stop() to skip kernfs_seq_stop_active() on ERR_PTR(-ENODEV). This is a bit nasty but ensures that the active put is skipped iff get_active failed in kernfs_seq_start(). tj: This was originally committed as d92d2e6bd72b but got reverted by 683bb2761fbf along with other kernfs self removal patches. However, this one is an independent fix and shouldn't have been reverted together. Reinstate the change. Sorry about the mess. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: fix get_active failure handling in kernfs_seq_*()"Greg Kroah-Hartman2014-01-131-44/+7Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit d92d2e6bd72b653f9811e0c9c46307c743b3fc58. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: replace kernfs_node->u.completion with ↵Greg Kroah-Hartman2014-01-131-11/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs_root->deactivate_waitq" This reverts commit ea1c472dfeada211a0100daa7976e8e8e779b858. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep()"Greg Kroah-Hartman2014-01-131-20/+11Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit a69d001cfc712b96ec9d7ba44d6285702a38dabf. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: remove KERNFS_REMOVED"Greg Kroah-Hartman2014-01-134-60/+42Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit ae34372eb8408b3d07e870f1939f99007a730d28. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: restructure removal path to fix possible premature return"Greg Kroah-Hartman2014-01-131-86/+53Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 45a140e587f3d32d8d424ed940dffb61e1739047. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: invoke kernfs_unmap_bin_file() directly from __kernfs_remove()"Greg Kroah-Hartman2014-01-133-35/+14Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit f601f9a2bf7dc1f7ee18feece4c4e2fc6845d6c4. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: remove kernfs_addrm_cxt"Greg Kroah-Hartman2014-01-134-29/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 99177a34110889a8f2c36420c34e3bcc9bfd8a70. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: make kernfs_get_active() block if the node is deactivated ↵Greg Kroah-Hartman2014-01-131-21/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | but not removed" This reverts commit 895a068a524e134900b9d98b519309b7aae7bbb1. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: implement kernfs_{de|re}activate[_self]()"Greg Kroah-Hartman2014-01-131-117/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 9f010c2ad5194a4b682e747984477850fabd03be. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs, sysfs, driver-core: implement kernfs_remove_self() and its ↵Greg Kroah-Hartman2014-01-132-95/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | wrappers" This reverts commit 1ae06819c77cff1ea2833c94f8c093fe8a5c79db. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "sysfs, driver-core: remove unused ↵Greg Kroah-Hartman2014-01-131-0/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | {sysfs|device}_schedule_callback_owner()" This reverts commit d1ba277e79889085a2faec3b68b91ce89c63f888. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Revert "kernfs: remove unnecessary NULL check in __kernfs_remove()"Greg Kroah-Hartman2014-01-131-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 88533f990c616cf50c2fe585ea03f75c806a293d. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: remove unnecessary NULL check in __kernfs_remove()Tejun Heo2014-01-121-3/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 895a068a524e ("kernfs: make kernfs_get_active() block if the node is deactivated but not removed") added "struct kernfs_root *root = kernfs_root(kn);" at the head of the function; however, the parameter @kn is checked for later implying that the function may be called with NULL. This means that we may end up invoking kernfs_root() with NULL which will oops. None of the existing users invokes removal with NULL @kn, so this bug doesn't actually trigger. We can relocate kernfs_root() invocation after NULL check; however, allowing NULL param tends to cause more confusion than actually helping anything. As there's no existing user, let's remove the spurious NULL check. This bug was detected by smatch. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()Tejun Heo2014-01-111-92/+0Star
| | | | | | | | | | | | | | | | | | All device_schedule_callback_owner() users are converted to use device_remove_file_self(). Remove now unused {sysfs|device}_schedule_callback_owner(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappersTejun Heo2014-01-102-0/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sometimes it's necessary to implement a node which wants to delete nodes including itself. This isn't straightforward because of kernfs active reference. While a file operation is in progress, an active reference is held and kernfs_remove() waits for all such references to drain before completing. For a self-deleting node, this is a deadlock as kernfs_remove() ends up waiting for an active reference that itself is sitting on top of. This currently is worked around in the sysfs layer using sysfs_schedule_callback() which makes such removals asynchronous. While it works, it's rather cumbersome and inherently breaks synchronicity of the operation - the file operation which triggered the operation may complete before the removal is finished (or even started) and the removal may fail asynchronously. If a removal operation is immmediately followed by another operation which expects the specific name to be available (e.g. removal followed by rename onto the same name), there's no way to make the latter operation reliable. The thing is there's no inherent reason for this to be asynchrnous. All that's necessary to do this synchronous is a dedicated operation which drops its own active ref and deactivates self. This patch implements kernfs_remove_self() and its wrappers in sysfs and driver core. kernfs_remove_self() is to be called from one of the file operations, drops the active ref and deactivates using __kernfs_deactivate_self(), removes the self node, and restores active ref to the dead node using __kernfs_reactivate_self() so that the ref is balanced afterwards. __kernfs_remove() is updated so that it takes an early exit if the target node is already fully removed so that the active ref restored by kernfs_remove_self() after removal doesn't confuse the deactivation path. This makes implementing self-deleting nodes very easy. The normal removal path doesn't even need to be changed to use kernfs_remove_self() for the self-deleting node. The method can invoke kernfs_remove_self() on itself before proceeding the normal removal path. kernfs_remove() invoked on the node by the normal deletion path will simply be ignored. This will replace sysfs_schedule_callback(). A subtle feature of sysfs_schedule_callback() is that it collapses multiple invocations - even if multiple removals are triggered, the removal callback is run only once. An equivalent effect can be achieved by testing the return value of kernfs_remove_self() - only the one which gets %true return value should proceed with actual deletion. All other instances of kernfs_remove_self() will wait till the enclosing kernfs operation which invoked the winning instance of kernfs_remove_self() finishes and then return %false. This trivially makes all users of kernfs_remove_self() automatically show correct synchronous behavior even when there are multiple concurrent operations - all "echo 1 > delete" instances will finish only after the whole operation is completed by one of the instances. v2: For !CONFIG_SYSFS, dummy version kernfs_remove_self() was missing and sysfs_remove_file_self() had incorrect return type. Fix it. Reported by kbuild test bot. v3: Updated to use __kernfs_{de|re}activate_self(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: implement kernfs_{de|re}activate[_self]()Tejun Heo2014-01-101-1/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements four functions to manipulate deactivation state - deactivate, reactivate and the _self suffixed pair. A new fields kernfs_node->deact_depth is added so that concurrent and nested deactivations are handled properly. kernfs_node->hash is moved so that it's paired with the new field so that it doesn't increase the size of kernfs_node. A kernfs user's lock would normally nest inside active ref but during removal the user may want to perform kernfs_remove() while holding the said lock, which would introduce a reverse locking dependency. This function can be used to break such reverse dependency by allowing deactivation step to performed separately outside user's critical section. This will also be used implement kernfs_remove_self(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: make kernfs_get_active() block if the node is deactivated but not ↵Tejun Heo2014-01-101-4/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | removed Currently, kernfs_get_active() fails if the target node is deactivated. This is fine as a node always gets removed after deactivation; however, we're gonna add reactivation so the assumption won't hold. It'd be incorrect for kernfs_get_active() to fail for a node which was deactivated only temporarily. This patch makes kernfs_get_active() block if the node is deactivated but not removed. If the node gets reactivated (not yet implemented), it will be retried and succeed. If the node gets removed, it will be woken up and fail. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: remove kernfs_addrm_cxtTejun Heo2014-01-104-117/+29Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs_addrm_cxt and the accompanying kernfs_addrm_start/finish() were added because there were operations which should be performed outside kernfs_mutex after adding and removing kernfs_nodes. The necessary operations were recorded in kernfs_addrm_cxt and performed by kernfs_addrm_finish(); however, after the recent changes which relocated deactivation and unmapping so that they're performed directly during removal, the only operation kernfs_addrm_finish() performs is kernfs_put(), which can be moved inside the removal path too. This patch moves the kernfs_put() of the base ref to __kernfs_remove() and remove kernfs_addrm_cxt and kernfs_addrm_start/finish(). * kernfs_add_one() is updated to grab and release the parent's active ref and kernfs_mutex itself. kernfs_get/put_active() and kernfs_addrm_start/finish() invocations around it are removed from all users. * __kernfs_remove() puts an unlinked node directly instead of chaining it to kernfs_addrm_cxt. Its callers are updated to grab and release kernfs_mutex instead of calling kernfs_addrm_start/finish() around it. v2: Updated to fit the v2 restructuring of removal path. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: invoke kernfs_unmap_bin_file() directly from __kernfs_remove()Tejun Heo2014-01-103-14/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs_unmap_bin_file() is supposed to unmap all memory mappings of the target file before kernfs_remove() finishes; however, it currently is being called from kernfs_addrm_finish() and has the same race problem as the original implementation of deactivation when there are multiple removers - only the remover which snatches the node to its addrm_cxt->removed list is guaranteed to wait for its completion before returning. It can be fixed by moving kernfs_unmap_bin_file() invocation from kernfs_addrm_finish() to __kernfs_remove(). The function may be called multiple times but that shouldn't do any harm. We end up dropping kernfs_mutex in the removal loop and the node may be removed inbetween by someone else. kernfs_unlink_sibling() is updated to test whether the node has already been removed and return accordingly. __kernfs_remove() in turn performs post-unlinking cleanup only if it actually unlinked the node. KERNFS_HAS_MMAP test is moved out of the unmap function into __kernfs_remove() so that we don't unlock kernfs_mutex unnecessarily. While at it, drop the now meaningless "bin" qualifier from the function name. v2: Rewritten to fit the v2 restructuring of removal path. HAS_MMAP test relocated. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: restructure removal path to fix possible premature returnTejun Heo2014-01-101-53/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recursive nature of kernfs_remove() means that, even if kernfs_remove() is not allowed to be called multiple times on the same node, there may be race conditions between removal of parent and its descendants. While we can claim that kernfs_remove() shouldn't be called on one of the descendants while the removal of an ancestor is in progress, such rule is unnecessarily restrictive and very difficult to enforce. It's better to simply allow invoking kernfs_remove() as the caller sees fit as long as the caller ensures that the node is accessible. The current behavior in such situations is broken. Whoever enters removal path first takes the node off the hierarchy and then deactivates. Following removers either return as soon as it notices that it's not the first one or can't even find the target node as it has already been removed from the hierarchy. In both cases, the following removers may finish prematurely while the nodes which should be removed and drained are still being processed by the first one. This patch restructures so that multiple removers, whether through recursion or direction invocation, always follow the following rules. * When there are multiple concurrent removers, only one puts the base ref. * Regardless of which one puts the base ref, all removers are blocked until the target node is fully deactivated and removed. To achieve the above, removal path now first deactivates the subtree, drains it and then unlinks one-by-one. __kernfs_deactivate() is called directly from __kernfs_removal() and drops and regrabs kernfs_mutex for each descendant to drain active refs. As this means that multiple removers can enter __kernfs_deactivate() for the same node, the function is updated so that it can handle multiple deactivators of the same node - only one actually deactivates but all wait till drain completion. The restructured removal path guarantees that a removed node gets unlinked only after the node is deactivated and drained. Combined with proper multiple deactivator handling, this guarantees that any invocation of kernfs_remove() returns only after the node itself and all its descendants are deactivated, drained and removed. v2: Draining separated into a separate loop (used to be in the same loop as unlink) and done from __kernfs_deactivate(). This is to allow exposing deactivation as a separate interface later. Root node removal was broken in v1 patch. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: remove KERNFS_REMOVEDTejun Heo2014-01-104-42/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | KERNFS_REMOVED is used to mark half-initialized and dying nodes so that they don't show up in lookups and deny adding new nodes under or renaming it; however, its role overlaps those of deactivation and removal from rbtree. It's necessary to deny addition of new children while removal is in progress; however, this role considerably intersects with deactivation - KERNFS_REMOVED prevents new children while deactivation prevents new file operations. There's no reason to have them separate making things more complex than necessary. KERNFS_REMOVED is also used to decide whether a node is still visible to vfs layer, which is rather redundant as equivalent determination can be made by testing whether the node is on its parent's children rbtree or not. This patch removes KERNFS_REMOVED. * Instead of KERNFS_REMOVED, each node now starts its life deactivated. This means that we now use both atomic_add() and atomic_sub() on KN_DEACTIVATED_BIAS, which is INT_MIN. The compiler generates an overflow warnings when negating INT_MIN as the negation can't be represented as a positive number. Nothing is actually broken but let's bump BIAS by one to avoid the warnings for archs which negates the subtrahend.. * KERNFS_REMOVED tests in add and rename paths are replaced with kernfs_get/put_active() of the target nodes. Due to the way the add path is structured now, active ref handling is done in the callers of kernfs_add_one(). This will be consolidated up later. * kernfs_remove_one() is updated to deactivate instead of setting KERNFS_REMOVED. This removes deactivation from kernfs_deactivate(), which is now renamed to kernfs_drain(). * kernfs_dop_revalidate() now tests RB_EMPTY_NODE(&kn->rb) instead of KERNFS_REMOVED and KERNFS_REMOVED test in kernfs_dir_pos() is dropped. A node which is removed from the children rbtree is not included in the iteration in the first place. This means that a node may be visible through vfs a bit longer - it's now also visible after deactivation until the actual removal. This slightly enlarged window difference doesn't make any difference to the userland. * Sanity check on KERNFS_REMOVED in kernfs_put() is replaced with checks on the active ref. * Some comment style updates in the affected area. v2: Reordered before removal path restructuring. kernfs_active() dropped and kernfs_get/put_active() used instead. RB_EMPTY_NODE() used in the lookup paths. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep()Tejun Heo2014-01-101-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There currently are two mechanisms gating active ref lockdep annotations - KERNFS_LOCKDEP flag and KERNFS_ACTIVE_REF type mask. The former disables lockdep annotations in kernfs_get/put_active() while the latter disables all of kernfs_deactivate(). While KERNFS_ACTIVE_REF also behaves as an optimization to skip the deactivation step for non-file nodes, the benefit is marginal and it needlessly diverges code paths. Let's drop KERNFS_ACTIVE_REF and use KERNFS_LOCKDEP in kernfs_deactivate() too. While at it, add a test helper kernfs_lockdep() to test KERNFS_LOCKDEP flag so that it's more convenient and the related code can be compiled out when not enabled. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: replace kernfs_node->u.completion with kernfs_root->deactivate_waitqTejun Heo2014-01-101-16/+11Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs_node->u.completion is used to notify deactivation completion from kernfs_put_active() to kernfs_deactivate(). We now allow multiple racing removals of the same node and the current removal scheme is no longer correct - kernfs_remove() invocation may return before the node is properly deactivated if it races against another removal. The removal path will be restructured to address the issue. To help such restructure which requires supporting multiple waiters, this patch replaces kernfs_node->u.completion with kernfs_root->deactivate_waitq. This makes deactivation event notifications share a per-root waitqueue_head; however, the wait path is quite cold and this will also allow shaving one pointer off kernfs_node. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * kernfs: fix get_active failure handling in kernfs_seq_*()Tejun Heo2014-01-101-7/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When kernfs_seq_start() fails to obtain an active reference, it returns ERR_PTR(-ENODEV). kernfs_seq_stop() is then invoked with the error pointer value; however, it still proceeds to invoke kernfs_put_active() on the node leading to unbalanced put. If kernfs_seq_stop() is called even after active ref failure, it should skip invocation of @ops->seq_stop() and put_active. Unfortunately, this is a bit complicated because active ref failure isn't the only thing which may fail with ERR_PTR(-ENODEV). @ops->seq_start/next() may also fail with the error value and kernfs_seq_stop() doesn't have a way to tell apart those failures. Work it around by factoring out the active part of kernfs_seq_stop() into kernfs_seq_stop_active() and invoking it directly if @ops->seq_start/next() fail with ERR_PTR(-ENODEV) and updating kernfs_seq_stop() to skip kernfs_seq_stop_active() on ERR_PTR(-ENODEV). This is a bit nasty but ensures that the active put is skipped iff get_active failed in kernfs_seq_start(). Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * Merge 3.13-rc5 into staging-nextGreg Kroah-Hartman2013-12-2427-269/+409
| |\ | | | | | | | | | | | | | | | We want these fixes here to handle some merge issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: add kernfs_dir_opsTejun Heo2013-12-172-2/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for mkdir(2), rmdir(2) and rename(2) syscalls. This is implemented through optional kernfs_dir_ops callback table which can be specified on kernfs_create_root(). An implemented callback is invoked when the matching syscall is invoked. As kernfs keep dcache syncs with internal representation and revalidates dentries on each access, the implementation of these methods is extremely simple. Each just discovers the relevant kernfs_node(s) and invokes the requested callback which is allowed to do any kernfs operations and the end result doesn't necessarily have to match the expected semantics of the syscall. This will be used to convert cgroup to use kernfs instead of its own filesystem implementation. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: allow negative dentriesTejun Heo2013-12-171-24/+13Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs doesn't allow negative dentries - kernfs_iop_lookup() returns ERR_PTR(-ENOENT) instead of NULL which short-circuits negative dentry creation and kernfs's d_delete() callback, kernfs_dop_delete(), returns 1 for all removed nodes. This in turn allows kernfs_dop_revalidate() to assume that there's no negative dentry for kernfs. This worked fine for sysfs but kernfs is scheduled to grow mkdir(2) support which depend on negative dentries. This patch updates so that kernfs allows negative dentries. The required changes are almost trivial - kernfs_iop_lookup() now returns NULL instead of ERR_PTR(-ENOENT) when the target kernfs_node doesn't exist, kernfs_dop_delete() is removed and kernfs_dop_revalidate() is updated to check whether the target dentry is negative and request fresh lookup if so. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: update kernfs_rename_ns() to consider KERNFS_STATIC_NAMETejun Heo2013-12-171-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs_rename_ns() currently assumes that the target sysfs_dirent has a copied name. This has been okay because sysfs supports rename only for directories which always have copied names; however, there's nothing in kernfs interface which calls for such restriction and currently invoking kernfs_rename_ns() on a regular file leads to oops because it ends up trying to kfree() a static name. This patch updates kernfs_rename_ns() so that it skips kfree() of the old name if it's static. This allows it to be used for all node types. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: mark static names with KERNFS_STATIC_NAMETejun Heo2013-12-174-15/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because sysfs used struct attribute which are supposed to stay constant, sysfs didn't copy names when creating regular files. The specified string for name was supposed to stay constant. Such distinction isn't inherent for kernfs. kernfs_create_file[_ns]() should be able to take the same @name as kernfs_create_dir[_ns]() As there can be huge number of sysfs attributes, we still want to be able to use static names for sysfs attributes. This patch renames kernfs_create_file_ns_key() to __kernfs_create_file() and adds @name_is_static parameter so that the caller can explicitly indicate that @name can be used without copying. kernfs is updated to use KERNFS_STATIC_NAME to distinguish static and copied names. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: add REMOVED check to create and rename pathsTejun Heo2013-12-171-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs currently assumes that the caller doesn't try to create a new node under a removed parent, rename a removed node, or move a node under a removed node. While this works fine for sysfs, it'd be nice to have protection against such cases especially given that kernfs is planned to add support for mkdir, rmdir and rename requsts from userland which may make race conditions more likely. This patch updates create and rename paths to check REMOVED and fail the operation with -ENOENT if performed on or towards removed nodes. Note that remove path already has such check. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: add @mode to kernfs_create_dir[_ns]()Tejun Heo2013-12-173-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sysfs assumed 0755 for all newly created directories and kernfs inherited it. This assumption is unnecessarily restrictive and inconsistent with kernfs_create_file[_ns](). This patch adds @mode parameter to kernfs_create_dir[_ns]() and update uses in sysfs accordingly. Among others, this will be useful for implementations of the planned ->mkdir() method. This patch doesn't introduce any behavior differences. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: s/sysfs/kernfs/ in internal functions and whatever is leftTejun Heo2013-12-126-251/+254
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. This patch performs the following renames. * s/sysfs_*()/kernfs_*()/ in all internal functions * s/sysfs/kernfs/ in internal strings, comments and whatever is remaining * Uniformly rename various vfs operations so that they're consistently named and distinguishable. This patch is strictly rename only and doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: s/sysfs/kernfs/ in global variablesTejun Heo2013-12-126-73/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. This patch performs the following renames. * s/sysfs_mutex/kernfs_mutex/ * s/sysfs_dentry_ops/kernfs_dops/ * s/sysfs_dir_operations/kernfs_dir_fops/ * s/sysfs_dir_inode_operations/kernfs_dir_iops/ * s/kernfs_file_operations/kernfs_file_fops/ - renamed for consistency * s/sysfs_symlink_inode_operations/kernfs_symlink_iops/ * s/sysfs_aops/kernfs_aops/ * s/sysfs_backing_dev_info/kernfs_bdi/ * s/sysfs_inode_operations/kernfs_iops/ * s/sysfs_dir_cachep/kernfs_node_cache/ * s/sysfs_ops/kernfs_sops/ This patch is strictly rename only and doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: s/sysfs/kernfs/ in constantsTejun Heo2013-12-128-45/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. This patch performs the following renames. * s/SYSFS_DIR/KERNFS_DIR/ * s/SYSFS_KOBJ_ATTR/KERNFS_FILE/ * s/SYSFS_KOBJ_LINK/KERNFS_LINK/ * s/SYSFS_{TYPE_FLAGS}/KERNFS_{TYPE_FLAGS}/ * s/SYSFS_FLAG_{FLAG}/KERNFS_{FLAG}/ * s/sysfs_type()/kernfs_type()/ * s/SD_DEACTIVATED_BIAS/KN_DEACTIVATED_BIAS/ This patch is strictly rename only and doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: s/sysfs/kernfs/ in various data structuresTejun Heo2013-12-127-138/+138
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. This patch performs the following renames. * s/sysfs_open_dirent/kernfs_open_node/ * s/sysfs_open_file/kernfs_open_file/ * s/sysfs_inode_attrs/kernfs_iattrs/ * s/sysfs_addrm_cxt/kernfs_addrm_cxt/ * s/sysfs_super_info/kernfs_super_info/ * s/sysfs_info()/kernfs_info()/ * s/sysfs_open_dirent_lock/kernfs_open_node_lock/ * s/sysfs_open_file_mutex/kernfs_open_file_mutex/ * s/sysfs_of()/kernfs_of()/ This patch is strictly rename only and doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: drop s_ prefix from kernfs_node membersTejun Heo2013-12-128-178/+174Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. s_ prefix for kernfs members is used inconsistently and a misnomer now. It's not like kernfs_node is used widely across the kernel making the ability to grep for the members particularly useful. Let's just drop the prefix. This patch is strictly rename only and doesn't introduce any functional difference. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | kernfs: s/sysfs_dirent/kernfs_node/ and rename its friends accordinglyTejun Heo2013-12-1212-668/+669
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs has just been separated out from sysfs and we're already in full conflict mode. Nothing can make the situation any worse. Let's take the chance to name things properly. This patch performs the following renames. * s/sysfs_elem_dir/kernfs_elem_dir/ * s/sysfs_elem_symlink/kernfs_elem_symlink/ * s/sysfs_elem_attr/kernfs_elem_file/ * s/sysfs_dirent/kernfs_node/ * s/sd/kn/ in kernfs proper * s/parent_sd/parent/ * s/target_sd/target/ * s/dir_sd/parent/ * s/to_sysfs_dirent()/rb_to_kn()/ * misc renames of local vars when they conflict with the above Because md, mic and gpio dig into sysfs details, this patch ends up modifying them. All are sysfs_dirent renames and trivial. While we can avoid these by introducing a dummy wrapping struct sysfs_dirent around kernfs_node, given the limited usage outside kernfs and sysfs proper, I don't think such workaround is called for. This patch is strictly rename only and doesn't introduce any functional difference. - mic / gpio renames were missing. Spotted by kbuild test robot. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Neil Brown <neilb@suse.de> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Ashutosh Dixit <ashutosh.dixit@intel.com> Cc: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | sysfs: fix use-after-free in sysfs_kill_sb()Tejun Heo2013-12-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While restructuring the [u]mount path, 4b93dc9b1c68 ("sysfs, kernfs: prepare mount path for kernfs") incorrectly updated sysfs_kill_sb() so that it first kills super_block and then tries to dereference its namespace tag to drop it. Fix it by caching namespace tag before killing the superblock and then drop the cached namespace tag. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/g/20131205031051.GC5135@yliu-dev.sh.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | sysfs: bail early from kernfs_file_mmap() to avoid spurious lockdep warningTejun Heo2013-12-112-8/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is v3.14 fix for the same issue that a8b14744429f ("sysfs: give different locking key to regular and bin files") addresses for v3.13. Due to the extensive kernfs reorganization in v3.14 branch, the same fix couldn't be ported as-is. The v3.13 fix was ignored while merging it into v3.14 branch. 027a485d12e0 ("sysfs: use a separate locking class for open files depending on mmap") assigned different lockdep key to sysfs_open_file->mutex depending on whether the file implements mmap or not in an attempt to avoid spurious lockdep warning caused by merging of regular and bin file paths. While this restored some of the original behavior of using different locks (at least lockdep is concerned) for the different clases of files. The restoration wasn't full because now the lockdep key assignment depends on whether the file has mmap or not instead of whether it's a regular file or not. This means that bin files which don't implement mmap will get assigned the same lockdep class as regular files. This is problematic because file_operations for bin files still implements the mmap file operation and checking whether the sysfs file actually implements mmap happens in the file operation after grabbing @sysfs_open_file->mutex. We still end up adding locking dependency from mmap locking to sysfs_open_file->mutex to the regular file mutex which triggers spurious circular locking warning. For v3.13, a8b14744429f ("sysfs: give different locking key to regular and bin files") fixed it by giving sysfs_open_file->mutex different lockdep keys depending on whether the file is regular or bin instead of whether mmap exists or not; however, due to the way sysfs is now layered behind kernfs, this approach is no longer viable. kernfs can tell whether a sysfs node has mmap implemented or not but can't tell whether a bin file from a regular one. This patch updates kernfs such that kernfs_file_mmap() checks SYSFS_FLAG_HAS_MMAP and bail before grabbing sysfs_open_file->mutex so that it doesn't add spurious locking dependency from mmap to sysfs_open_file->mutex and changes sysfs so that it specifies kernfs_ops->mmap iff the sysfs file implements mmap. Combined, this ensures that sysfs_open_file->mutex is grabbed under mmap path iff the sysfs file actually implements mmap. As sysfs_open_file->mutex is already given a different lockdep key if mmap is implemented, this removes the spurious locking dependency. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Dave Jones <davej@redhat.com> Link: http://lkml.kernel.org/g/20131203184324.GA11320@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | Merge branch 'driver-core-linus' into driver-core-nextTejun Heo2013-12-1017-121/+103Star
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a8b14744429f ("sysfs: give different locking key to regular and bin files") in driver-core-linus modifies sysfs_open_file() so that it gives out different locking classes to sysfs_open_files depending on whether the file is bin or not. Due to the massive kernfs reorganization in driver-core-next, this naturally causes merge conflict in fs/sysfs/file.c. Due to the way things are split between kernfs and sysfs in driver-core-next, the same fix can't easily be applied to driver-core-next. This merge simply ignores the offending commit. A following patch will implement a separate fix for the issue. Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | sysfs, kernfs: remove duplicated include from file.cWei Yongjun2013-12-091-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | kernfs: implement "trusted.*" xattr supportTejun Heo2013-12-044-11/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kernfs inherited "security.*" xattr support from sysfs. This patch extends xattr support to "trusted.*" using simple_xattr_*(). As trusted xattrs are restricted to CAP_SYS_ADMIN, simple_xattr_*() which uses kernel memory for storage shouldn't be problematic. Note that the existing "security.*" support doesn't implement get/remove/list and the this patch only implements those ops for "trusted.*". We probably want to extend those ops to include support for "security.*". This patch will allow using kernfs from cgroup which requires "trusted.*" xattr support. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: David P. Quigley <dpquigl@tycho.nsa.gov> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| * | | kernfs: update sysfs_init_inode_attrs()Tejun Heo2013-12-042-37/+29Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sysfs_init_inode_attrs() is a bit clumsy to use requiring the caller to check whether @sd->s_iattr is already set or not. Rename it to sysfs_inode_attrs(), update it to check whether @sd->s_iattr is already initialized before trying to initialize it and return @sd->s_iattr. This simplifies the callers. While at it, * Rename struct sysfs_inode_attrs pointer variables to "attrs". As kernfs no longer deals with "struct attribute", this isn't confusing and makes it easier to distinguish from struct iattr pointers. * A new field will be added to sysfs_inode_attrs. Reindent in preparation. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>