openslx/kernel-qcow2-linux.git - In-kernel qcow2 (Kernel part)

	Commit message (Collapse)	Author	Age	Files	Lines
*	GFS2: Fix glock queue trace point	Steven Whitehouse	2011-01-31	1	-1/+1
\| \| \| \| \| \| \|	Somehow this tracepoint landed up in the wrong place. This moves it to where it should be. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
*	GFS2: Post-VFS scale update for RCU path walk	Steven Whitehouse	2011-01-21	2	-7/+10
\| \| \| \| \| \| \| \| \| \|	We can allow a few more cases to use RCU path walking than originally allowed. It should be possible to also enable RCU path walking when the glock is already cached. Thats a bit more complicated though, so left for a future patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Nick Piggin <npiggin@gmail.com>
*	GFS2: Use RCU for glock hash table	Steven Whitehouse	2011-01-21	8	-297/+190
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This has a number of advantages: - Reduces contention on the hash table lock - Makes the code smaller and simpler - Should speed up glock dumps when under load - Removes ref count changing in examine_bucket - No longer need hash chain lock in glock_put() in common case There are some further changes which this enables and which we may do in the future. One is to look at using SLAB_RCU, and another is to look at using a per-cpu counter for the per-sb glock counter, since that is touched twice in the lifetime of each glock (but only used at umount time). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
*	Merge branch 'akpm'	Linus Torvalds	2011-01-21	4	-8/+12
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* akpm: kernel/smp.c: consolidate writes in smp_call_function_interrupt() kernel/smp.c: fix smp_call_function_many() SMP race memcg: correctly order reading PCG_USED and pc->mem_cgroup backlight: fix 88pm860x_bl macro collision drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking MAINTAINERS: update Atmel AT91 entry mm: fix truncate_setsize() comment memcg: fix rmdir, force_empty with THP memcg: fix LRU accounting with THP memcg: fix USED bit handling at uncharge in THP memcg: modify accounting function for supporting THP better fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio mm: compaction: prevent division-by-zero during user-requested compaction mm/vmscan.c: remove duplicate include of compaction.h memblock: fix memblock_is_region_memory() thp: keep highpte mapped until it is no longer needed kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT
\| *	fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio	David Dillow	2011-01-21	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using devices that support max_segments > BIO_MAX_PAGES (256), direct IO tries to allocate a bio with more pages than allowed, which leads to an oops in dio_bio_alloc(). Clamp the request to the supported maximum, and change dio_bio_alloc() to reflect that bio_alloc() will always return a bio when called with __GFP_WAIT and a valid number of vectors. [akpm@linux-foundation.org: remove redundant BUG_ON()] Signed-off-by: David Dillow <dillowda@ornl.gov> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| *	kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT	David Rientjes	2011-01-21	3	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option is used to configure any non-standard kernel with a much larger scope than only small devices. This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes references to the option throughout the kernel. A new CONFIG_EMBEDDED option is added that automatically selects CONFIG_EXPERT when enabled and can be used in the future to isolate options that should only be considered for embedded systems (RISC architectures, SLOB, etc). Calling the option "EXPERT" more accurately represents its intention: only expert users who understand the impact of the configuration changes they are making should enable it. Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Greg KH <gregkh@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Robin Holt <holt@sgi.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	Merge branch 'master' of ↵	Linus Torvalds	2011-01-21	13	-381/+455
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: mangle existing header for SMB_COM_NT_CANCEL cifs: remove code for setting timeouts on requests [CIFS] cifs: reconnect unresponsive servers cifs: set up recurring workqueue job to do SMB echo requests cifs: add ability to send an echo request cifs: add cifs_call_async cifs: allow for different handling of received response cifs: clean up sync_mid_result cifs: don't reconnect server when we don't get a response cifs: wait indefinitely for responses cifs: Use mask of ACEs for SID Everyone to calculate all three permissions user, group, and other cifs: Fix regression during share-level security mounts (Repost) [CIFS] Update cifs version number cifs: move mid result processing into common function cifs: move locked sections out of DeleteMidQEntry and AllocMidQEntry cifs: clean up accesses to midCount cifs: make wait_for_free_request take a TCP_Server_Info pointer cifs: no need to mark smb_ses_list as cifs_demultiplex_thread is exiting cifs: don't fail writepages on -EAGAIN errors CIFS: Fix oplock break handling (try #2)
\| * \|	cifs: mangle existing header for SMB_COM_NT_CANCEL	Jeff Layton	2011-01-20	1	-25/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NT_CANCEL command looks just like the original command, except for a few small differences. The send_nt_cancel function however currently takes a tcon, which we don't have in SendReceive and SendReceive2. Instead of "respinning" the entire header for an NT_CANCEL, just mangle the existing header by replacing just the fields we need. This means we don't need a tcon and allows us to call it from other places. Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: remove code for setting timeouts on requests	Jeff Layton	2011-01-20	6	-50/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since we don't time out individual requests anymore, remove the code that we used to use for setting timeouts on different requests. Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	[CIFS] cifs: reconnect unresponsive servers	Steve French	2011-01-20	3	-5/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the server isn't responding to echoes, we don't want to leave tasks hung waiting for it to reply. At that point, we'll want to reconnect so that soft mounts can return an error to userspace quickly. If the client hasn't received a reply after a specified number of echo intervals, assume that the transport is down and attempt to reconnect the socket. The number of echo_intervals to wait before attempting to reconnect is tunable via a module parameter. Setting it to 0, means that the client will never attempt to reconnect. The default is 5. Signed-off-by: Jeff Layton <jlayton@redhat.com>
\| * \|	cifs: set up recurring workqueue job to do SMB echo requests	Jeff Layton	2011-01-20	2	-0/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: add ability to send an echo request	Jeff Layton	2011-01-20	4	-1/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: add cifs_call_async	Jeff Layton	2011-01-20	2	-1/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a function that will send a request, and set up the mid for an async reply. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: allow for different handling of received response	Jeff Layton	2011-01-20	4	-35/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to incorporate async requests, we need to allow for a more general way to do things on receive, rather than just waking up a process. Turn the task pointer in the mid_q_entry into a callback function and a generic data pointer. When a response comes in, or the socket is reconnected, cifsd can call the callback function in order to wake up the process. The default is to just wake up the current process which should mean no change in behavior for existing code. Also, clean up the locking in cifs_reconnect. There doesn't seem to be any need to hold both the srv_mutex and GlobalMid_Lock when walking the list of mids. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: clean up sync_mid_result	Jeff Layton	2011-01-20	1	-17/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make it use a switch statement based on the value of the midStatus. If the resp_buf is set, then MID_RESPONSE_RECEIVED is too. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: don't reconnect server when we don't get a response	Jeff Layton	2011-01-20	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We only want to force a reconnect to the server under very limited and specific circumstances. Now that we have processes waiting indefinitely for responses, we shouldn't reach this point unless a reconnect is already in process. Thus, there's no reason to re-mark the server for reconnect here. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: wait indefinitely for responses	Jeff Layton	2011-01-20	1	-93/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The client should not be timing out on individual SMB requests. Too much of the state between client and server is tied to the state of the socket. If we time out requests and issue spurious disconnects then that comprimises data integrity. Instead of doing this complicated dance where we try to decide how long to wait for a response for particular requests, have the client instead wait indefinitely for a response. Also, use a TASK_KILLABLE sleep here so that fatal signals will break out of this waiting. Later patches will add support for detecting dead peers and forcing reconnects based on that. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: Use mask of ACEs for SID Everyone to calculate all three permissions ↵	Shirish Pargaonkar	2011-01-19	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	user, group, and other If a DACL has entries for ACEs for SID Everyone and Authenticated Users, factor in mask in respective entries during calculation of permissions for all three, user, group, and other. http://technet.microsoft.com/en-us/library/bb463216.aspx Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: Fix regression during share-level security mounts (Repost)	Shirish Pargaonkar	2011-01-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NTLM response length was changed to 16 bytes instead of 24 bytes that are sent in Tree Connection Request during share-level security share mounts. Revert it back to 24 bytes. Reported-and-Tested-by: Grzegorz Ozanski <grzegorz.ozanski@intel.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Cc: stable@kernel.org Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	[CIFS] Update cifs version number	Steve French	2011-01-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: move mid result processing into common function	Jeff Layton	2011-01-19	1	-78/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: move locked sections out of DeleteMidQEntry and AllocMidQEntry	Jeff Layton	2011-01-19	1	-17/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In later patches, we're going to need to have finer-grained control over the addition and removal of these structs from the pending_mid_q and we'll need to be able to call the destructor while holding the spinlock. Move the locked sections out of both routines and into the callers. Fix up current callers of DeleteMidQEntry to call a new routine that dequeues the entry and then destroys it. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: clean up accesses to midCount	Jeff Layton	2011-01-19	3	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's an atomic_t and the code accesses the "counter" field in it directly instead of using atomic_read(). It also is sometimes accessed under a spinlock and sometimes not. Move it out of the spinlock since we don't need belt-and-suspenders for something that's just informational. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: make wait_for_free_request take a TCP_Server_Info pointer	Jeff Layton	2011-01-19	1	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cifsSesInfo pointer is only used to get at the server. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: no need to mark smb_ses_list as cifs_demultiplex_thread is exiting	Jeff Layton	2011-01-19	1	-41/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The TCP_Server_Info is refcounted and every SMB session holds a reference to it. Thus, smb_ses_list is always going to be empty when cifsd is coming down. This is dead code. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	cifs: don't fail writepages on -EAGAIN errors	Jeff Layton	2011-01-19	1	-12/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If CIFSSMBWrite2 returns -EAGAIN, then the error should be considered temporary. CIFS should retry the write instead of setting an error on the mapping and returning. For WB_SYNC_ALL, just retry the write immediately. In the WB_SYNC_NONE case, call redirty_page_for_writeback on all of the pages that didn't get written out and then move on. Also, fix up the handling of a short write with a successful return code. MS-CIFS says that 0 bytes_written means ENOSPC or EFBIG. It doesn't mention what a short, but non-zero write means, so for now treat it as we would an -EAGAIN return. Reviewed-by: Suresh Jayaraman <sjayaraman@suse.de> Reviewed-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
\| * \|	CIFS: Fix oplock break handling (try #2)	Pavel Shilovsky	2011-01-19	4	-13/+16
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we get oplock break notification we should set the appropriate value of OplockLevel field in oplock break acknowledge according to the oplock level held by the client in this time. As we only can have level II oplock or no oplock in the case of oplock break, we should be aware only about clientCanCacheRead field in cifsInodeInfo structure. Also fix bug connected with wrong interpretation of OplockLevel field during oplock break notification processing. Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>
* \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes	Linus Torvalds	2011-01-21	3	-51/+23
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: Fix error path in gfs2_lookup_by_inum() GFS2: remove iopen glocks from cache on failed deletes
\| * \|	GFS2: Fix error path in gfs2_lookup_by_inum()	Steven Whitehouse	2011-01-18	2	-51/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the (impossible, except if there is fs corruption) error path in gfs2_lookup_by_inum() if the call to gfs2_inode_refresh() fails, it was leaving the function by calling iput() rather than iget_failed(). This would cause future lookups of the same inode to block forever. This patch fixes the problem by moving the call to gfs2_inode_refresh() into gfs2_inode_lookup() where iget_failed() is part of the error path already. Also this cleans up some unreachable code and makes gfs2_set_iop() static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \|	GFS2: remove iopen glocks from cache on failed deletes	Benjamin Marzinski	2011-01-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a file gets deleted on GFS2, if a node can't get an exclusive lock on the file's iopen glock, it punts on actually freeing up the space, because another node is using the file. When it does this, it needs to drop the iopen glock from its cache so that the other node can get an exclusive lock on it. Now, gfs2_delete_inode() sets GL_NOCACHE before dropping the shared lock on the iopen glock in preparation for grabbing it in the exclusive state. Since the node needs the glock in the exclusive state, dropping the shared lock from the cache doesn't slow down the case where no other nodes are using the file. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* \| \|	Fix broken "pipe: use event aware wakeups" optimization	Linus Torvalds	2011-01-21	1	-5/+5
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit e462c448fdc8 ("pipe: use event aware wakeups") optimized the pipe event wakeup calls to avoid wakeups if the events do not match the requested set. However, the optimization was buggy, in that it didn't actually use the correct sets for the events: when we make room for more data to be written, the pipe poll() routine will return both the POLLOUT _and_ POLLWRNORM bits. Similarly for read. And most critically, when a pipe is released, that will potentially result in POLLHUP\|POLLERR (depending on whether it was the last reader or writer), not just the regular POLLIN\|POLLOUT. This bug showed itself as a hung gnome-screensaver-dialog process, stuck forever (or at least until it was poked by a signal or by being traced) in a poll() system call. Cc: Davide Libenzi <davidel@xmailserver.org> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \|	autofs4: clean ->d_release() and autofs4_free_ino() up	Al Viro	2011-01-18	3	-19/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The latter is called only when both ino and dentry are about to be freed, so cleaning ->d_fsdata and ->dentry is pointless. Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4: split autofs4_init_ino()	Al Viro	2011-01-18	3	-26/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	split init_ino into new_ino and clean_ino; the former is what used to be init_ino(NULL, sbi), the latter is for cases where we passed non-NULL ino. Lose unused arguments. Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4: mkdir and symlink always get a dentry that had passed lookup	Al Viro	2011-01-18	1	-18/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	... so ->d_fsdata will have been set up before we get there Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4: autofs4_get_inode() doesn't need autofs_info * argument anymore	Al Viro	2011-01-18	3	-7/+5
\| \| \| \| \| \| \| \| \| \|	Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4: kill ->size in autofs_info	Al Viro	2011-01-18	3	-6/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's used only to pass the length of symlink body to autofs4_get_inode() in autofs4_dir_symlink(). We can bloody well set inode->i_size in autofs4_dir_symlink() directly and be done with that. Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4: pass mode to autofs4_get_inode() explicitly	Al Viro	2011-01-18	3	-16/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In all cases we'd set inf->mode to know value just before passing it to autofs4_get_inode(). That kills the need to store it in autofs_info and pass it to autofs_init_ino() Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4: autofs4_mkroot() is not different from autofs4_init_ino()	Al Viro	2011-01-18	1	-12/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Kill it. Mind you, it's been an obfuscated call of autofs4_init_ino() ever since 2.3.99pre6-4... Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4: keep symlink body in inode->i_private	Al Viro	2011-01-18	4	-28/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	gets rid of all ->free()/->u.symlink machinery in autofs; we simply keep symlink bodies in inode->i_private and free them in ->evict_inode(). Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4 - fix debug print in autofs4_lookup()	Ian Kent	2011-01-18	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	oz_mode isn't defined any more, use autofs4_oz_mode(sbi) instead. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	vfs - fix dentry ref count in do_lookup()	Ian Kent	2011-01-18	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a ref count problem in fs/namei.c:do_lookup(). When walking in ref-walk mode, if follow_managed() returns a fail we need to drop dentry and possibly vfsmount. Clean up properly, as we do in the other caller of follow_managed(). Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* \|	autofs4 - fix get_next_positive_dentry()	Ian Kent	2011-01-18	1	-2/+2
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	The initialization condition in fs/autofs4/expire.c:get_next_positive_dentry() appears to be incorrect. If prev == NULL I believe that root should be returned. Further down, at the current dentry check for it being simple_positive() it looks like the d_lock for dentry p should be dropped instead of dentry ret, otherwise when p is assinged to ret we end up with no lock on p and a lost lock on ret, which leads to a deadlock. Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	Merge branch 'for-linus' of ↵	Linus Torvalds	2011-01-17	29	-623/+2490
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (25 commits) Btrfs: forced readonly mounts on errors btrfs: Require CAP_SYS_ADMIN for filesystem rebalance Btrfs: don't warn if we get ENOSPC in btrfs_block_rsv_check btrfs: Fix memory leak in btrfs_read_fs_root_no_radix() btrfs: check NULL or not btrfs: Don't pass NULL ptr to func that may deref it. btrfs: mount failure return value fix btrfs: Mem leak in btrfs_get_acl() btrfs: fix wrong free space information of btrfs btrfs: make the chunk allocator utilize the devices better btrfs: restructure find_free_dev_extent() btrfs: fix wrong calculation of stripe size btrfs: try to reclaim some space when chunk allocation fails btrfs: fix wrong data space statistics fs/btrfs: Fix build of ctree Btrfs: fix off by one while setting block groups readonly Btrfs: Add BTRFS_IOC_SUBVOL_GETFLAGS/SETFLAGS ioctls Btrfs: Add readonly snapshots support Btrfs: Refactor btrfs_ioctl_snap_create() btrfs: Extract duplicate decompress code ...
\| *	Btrfs: forced readonly mounts on errors	liubo	2011-01-17	7	-2/+523
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch comes from "Forced readonly mounts on errors" ideas. As we know, this is the first step in being more fault tolerant of disk corruptions instead of just using BUG() statements. The major content: - add a framework for generating errors that should result in filesystems going readonly. - keep FS state in disk super block. - make sure that all of resource will be freed and released at umount time. - make sure that fter FS is forced readonly on error, there will be no more disk change before FS is corrected. For this, we should stop write operation. After this patch is applied, the conversion from BUG() to such a framework can happen incrementally. Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| *	btrfs: Require CAP_SYS_ADMIN for filesystem rebalance	Ben Hutchings	2011-01-16	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Filesystem rebalancing (BTRFS_IOC_BALANCE) affects the entire filesystem and may run uninterruptibly for a long time. This does not seem to be something that an unprivileged user should be able to do. Reported-by: Aron Xu <happyaron.xu@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| *	Btrfs: don't warn if we get ENOSPC in btrfs_block_rsv_check	Josef Bacik	2011-01-16	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we run low on space we could get a bunch of warnings out of btrfs_block_rsv_check, but this is mostly just called via the transaction code to see if we need to end the transaction, it expects to see failures, so let's not WARN and freak everybody out for no reason. Thanks, Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| *	btrfs: Fix memory leak in btrfs_read_fs_root_no_radix()	Tsutomu Itoh	2011-01-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In btrfs_read_fs_root_no_radix(), 'root' is not freed if btrfs_search_slot() returns error. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| *	btrfs: check NULL or not	Tsutomu Itoh	2011-01-16	3	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Should check if functions returns NULL or not. Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| *	btrfs: Don't pass NULL ptr to func that may deref it.	Jesper Juhl	2011-01-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hi, In fs/btrfs/inode.c::fixup_tree_root_location() we have this code: ... if (!path) { err = -ENOMEM; goto out; } ... out: btrfs_free_path(path); return err; btrfs_free_path() passes its argument on to other functions and some of them end up dereferencing the pointer. In the code above that pointer is clearly NULL, so btrfs_free_path() will eventually cause a NULL dereference. There are many ways to cut this cake (fix the bug). The one I chose was to make btrfs_free_path() deal gracefully with NULL pointers. If you disagree, feel free to come up with an alternative patch. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Chris Mason <chris.mason@oracle.com>
\| *	btrfs: mount failure return value fix	Dave Young	2011-01-16	2	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I happened to pass swap partition as root partition in cmdline, then kernel panic and tell me about "Cannot open root device". It is not correct, in fact it is a fs type mismatch instead of 'no device'. Eventually I found btrfs mounting failed with -EIO, it should be -EINVAL. The logic in init/do_mounts.c: for (p = fs_names; *p; p += strlen(p)+1) { int err = do_mount_root(name, p, flags, root_mount_data); switch (err) { case 0: goto out; case -EACCES: flags \|= MS_RDONLY; goto retry; case -EINVAL: continue; } print "Cannot open root device" panic } SO fs type after btrfs will have no chance to mount Here fix the return value as -EINVAL Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>