summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'f2fs-for-3.8-rc5' of ↵Linus Torvalds2013-01-2214-131/+189
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs fixes from Jaegeuk Kim: o Support swap file and link generic_file_remap_pages o Enhance the bio streaming flow and free section control o Major bug fix on recovery routine o Minor bug/warning fixes and code cleanups * tag 'f2fs-for-3.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (22 commits) f2fs: use _safe() version of list_for_each f2fs: add comments of start_bidx_of_node f2fs: avoid issuing small bios due to several dirty node pages f2fs: support swapfile f2fs: add remap_pages as generic_file_remap_pages f2fs: add __init to functions in init_f2fs_fs f2fs: fix the debugfs entry creation path f2fs: add global mutex_lock to protect f2fs_stat_list f2fs: remove the blk_plug usage in f2fs_write_data_pages f2fs: avoid redundant time update for parent directory in f2fs_delete_entry f2fs: remove redundant call to set_blocksize in f2fs_fill_super f2fs: move f2fs_balance_fs to punch_hole f2fs: add f2fs_balance_fs in several interfaces f2fs: revisit the f2fs_gc flow f2fs: check return value during recovery f2fs: avoid null dereference in f2fs_acl_from_disk f2fs: initialize newly allocated dnode structure f2fs: update f2fs partition info about SIT/NAT layout f2fs: update f2fs document to reflect SIT/NAT layout correctly f2fs: remove unneeded INIT_LIST_HEAD at few places ...
| * f2fs: use _safe() version of list_for_eachDan Carpenter2013-01-221-4/+3Star
| | | | | | | | | | | | | | | | | | | | | | This is calling list_del() inside a loop which is a problem when we try move to the next item on the list. I've converted it to use the _safe version. And also, as a cleanup, I've converted it to use list_for_each_entry instead of list_for_each. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add comments of start_bidx_of_nodeJaegeuk Kim2013-01-221-1/+5
| | | | | | | | | | | | | | | | | | The caller of start_bidx_of_node() should give proper node offsets which point only direct node blocks. Otherwise, it is a caller's bug. This patch adds comments to make it clear. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: avoid issuing small bios due to several dirty node pagesJaegeuk Kim2013-01-221-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If some small bios of dirty node pages are supposed to be issued during the sequential data writes, there-in well-produced consecutive data bios are able to be split by the small node bios, resulting in performance degradation. So, let's collect a number of dirty node pages until reaching a threshold. And, by default, I set the threshold as 2MB, a segment size. This improves sequential write performance on i5, 512GB SSD (830 w/ SATA2) as follows. Before: 231 MB/s -> After: 255 MB/s Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
| * f2fs: support swapfileJaegeuk Kim2013-01-221-0/+6
| | | | | | | | | | | | | | This patch adds f2fs_bmap operation to the data address space. This enables f2fs to support swapfile. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add remap_pages as generic_file_remap_pagesJaegeuk Kim2013-01-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was added for all the file systems before. See the following commit. commit id: 0b173bc4daa8f8ec03a85abf5e47b23502ff80af [PATCH] mm: kill vma flag VM_CAN_NONLINEAR This patch moves actual ptes filling for non-linear file mappings into special vma operation: ->remap_pages(). File system must implement this method to get non-linear mappings support, if it uses filemap_fault() then generic_file_remap_pages() can be used. Now device drivers can implement this method and obtain nonlinear vma support." Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add __init to functions in init_f2fs_fsNamjae Jeon2013-01-225-9/+9
| | | | | | | | | | | | | | | | Add __init to functions in init_f2fs_fs for code consistency. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: fix the debugfs entry creation pathNamjae Jeon2013-01-153-21/+19Star
| | | | | | | | | | | | | | | | | | | | | | | | | | As the "status" debugfs entry will be maintained for entire F2FS filesystem irrespective of the number of partitions. So, we can move the initialization to the init part of the f2fs and destroy will be done from exit part. After making changes, for individual partition mount - entry creation code will not be executed. Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add global mutex_lock to protect f2fs_stat_listmajianpeng2013-01-151-12/+11Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an race condition between umounting f2fs and reading f2fs/status, which results in oops. Fox example: Thread A Thread B umount f2fs cat f2fs/status f2fs_destroy_stats() { stat_show() { list_for_each_entry_safe(&f2fs_stat_list) list_del(&si->stat_list); mutex_lock(&si->stat_lock); si->sbi = NULL; mutex_unlock(&si->stat_lock); kfree(sbi->stat_info); } mutex_lock(&si->stat_lock) <- si is gone. ... } Solution with a global lock: f2fs_stat_mutex: Thread A Thread B umount f2fs cat f2fs/status f2fs_destroy_stats() { stat_show() { mutex_lock(&f2fs_stat_mutex); list_del(&si->stat_list); mutex_unlock(&f2fs_stat_mutex); kfree(sbi->stat_info); mutex_lock(&f2fs_stat_mutex); } list_for_each_entry_safe(&f2fs_stat_list) ... mutex_unlock(&f2fs_stat_mutex); } Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> [jaegeuk.kim@samsung.com: fix typos, description, and remove the existing lock] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: remove the blk_plug usage in f2fs_write_data_pagesNamjae Jeon2013-01-151-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's consider the usage of blk_plug in f2fs_write_data_pages(). We can come up with the two issues: lock contention and task awareness. 1. Merging bios prior to grabing "queue lock" The f2fs merges consecutive IOs in the file system level before submitting any bios, which is similar with the back merge by the plugging mechanism in attempt_plug_merge(). Both of them need to acquire no queue lock. 2. Merging policy with respect to tasks The f2fs merges IOs as much as possible regardless of tasks, while blk-plugging is conducted on a basis of tasks. As we can understand there are trade-offs, f2fs tries to maximize the write performance with well-merged bios. As a result, if f2fs produces many consecutive but separated bios in writepages(), it would be good to use blk-plugging since f2fs would be able to avoid queue lock contention in the block layer by merging them. But, f2fs merges IOs and submit one bio, which means that there are not much chances to merge bios by attempt_plug_merge(). However, f2fs has already been used blk_plug by triggering generic_writepages() in f2fs_write_data_pages(). So to make the overall code consistency, I'd like to remove blk_plug there. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: avoid redundant time update for parent directory in f2fs_delete_entryNamjae Jeon2013-01-141-1/+1
| | | | | | | | | | | | | | | | | | | | In call to f2fs_delete_entry, 'dir' time modification code is put at two places. So, remove the redundant code for timing update. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: remove redundant call to set_blocksize in f2fs_fill_superNamjae Jeon2013-01-141-5/+1Star
| | | | | | | | | | | | | | | | | | | | Since, f2fs supports only 4KB blocksize, which is set at the beginning in f2fs_fill_super. So, we do not need to again check this blocksize setting in such case. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: move f2fs_balance_fs to punch_holeJaegeuk Kim2013-01-111-3/+2Star
| | | | | | | | | | | | | | | | | | | | | | The f2fs_fallocate() has two operations: punch_hole and expand_size. Only in the case of punch_hole, dirty node pages can be produced, so let's trigger f2fs_balance_fs() in this case only. Furthermore, let's trigger it at every data truncation routine. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: add f2fs_balance_fs in several interfacesJaegeuk Kim2013-01-114-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The f2fs_balance_fs() is to check the number of free sections and decide whether it needs to conduct cleaning or not. If there are not enough free sections, the cleaning job should be started. In order to control an amount of free sections even under high utilization, f2fs should call f2fs_balance_fs at all the VFS interfaces that are able to produce dirty pages. This patch adds the function calls in the missing interfaces as follows. 1. f2fs_setxattr() The f2fs_setxattr() produces dirty node pages so that we should call f2fs_balance_fs() either likewise doing in other VFS interfaces such as f2fs_lookup(), f2fs_mkdir(), and so on. 2. f2fs_sync_file() We should guarantee serving free sections for syncing metadata during fsync. Previously, there is no space check before triggering checkpoint and sync_node_pages. Therefore, if a bunch of fsync calls are triggered under 100% of FS utilization, f2fs is able to be faced with no free sections, resulting in BUG_ON(). 3. f2fs_sync_fs() Before calling write_checkpoint(), we should guarantee that there are minimum free sections. 4. f2fs_write_inode() f2fs_write_inode() is also able to produce dirty node pages. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: revisit the f2fs_gc flowJaegeuk Kim2013-01-093-41/+23Star
| | | | | | | | | | | | | | | | | | | | | | | | I'd like to revisit the f2fs_gc flow and rewrite as follows. 1. In practical, the nGC parameter of f2fs_gc is meaningless. So, let's remove it. 2. Background GC marks victim blocks as dirty one at a time. 3. Foreground GC should do cleaning job until acquiring enough free sections. Afterwards, it needs to do checkpoint. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: check return value during recoveryJaegeuk Kim2013-01-041-1/+1
| | | | | | | | | | | | | | | | This patch resolves Coverity #753102: >>> No check of the return value of "f2fs_add_link(&dent, inode)". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: avoid null dereference in f2fs_acl_from_diskJaegeuk Kim2013-01-041-7/+6Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch resolves Coverity #751303: >>> CID 753103: Explicit null dereferenced (FORWARD_NULL) Passing null >>> pointer "value" to function "f2fs_acl_from_disk(char const *, size_t)", which dereferences it. [Error path] - value = NULL; - retval = 0 by f2fs_getxattr(); - f2fs_acl_from_disk(value:NULL, ...); Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: initialize newly allocated dnode structureJaegeuk Kim2013-01-041-1/+1
| | | | | | | | | | | | | | | | | | This patch resolves Coverity #753112. In practical, the existing code flow does not fall into the reported errorneous path. But, anyway, let's avoid this for future. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: update f2fs partition info about SIT/NAT layoutHuajun Li2013-01-041-2/+2
| | | | | | | | | | | | | | Update partition info output under debug FS to reflect segment layout correctly. Signed-off-by: Huajun Li <huajun.li.lee@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: remove unneeded INIT_LIST_HEAD at few placesNamjae Jeon2013-01-042-2/+0Star
| | | | | | | | | | | | | | | | | | | | While creating a new entry for addition to the list(orphan inode list and fsync inode entry list), there is no need to call HEAD initialization for these entries. So, remove that init part. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: fix time update in case of f2fs fallocateNamjae Jeon2013-01-041-0/+5
| | | | | | | | | | | | | | | | | | | | After doing a punch hole or expanding inode doing fallocation. The change and modification time are not update for the file. So, update time after no issue is observed in fallocate. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * f2fs: introduce f2fs_msg to ease adding information printsNamjae Jeon2013-01-042-17/+65
| | | | | | | | | | | | | | | | | | | | Introduced f2fs_msg function to differentiate f2fs specific messages in the log. And, added few informative prints in the mount path, to convey proper error in case of mount failure. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* | Merge tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfsLinus Torvalds2013-01-177-42/+64
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull xfs bugfixes from Ben Myers: - fix(es) for compound buffers - fix for dquot soft timer asserts due to overflow of d_blk_softlimit - fix for regression in dir v2 code introduced in commit 20f7e9f3726a ("xfs: factor dir2 block read operations") * tag 'for-linus-v3.8-rc4' of git://oss.sgi.com/xfs/xfs: xfs: recalculate leaf entry pointer after compacting a dir2 block xfs: remove int casts from debug dquot soft limit timer asserts xfs: fix the multi-segment log buffer format xfs: fix segment in xfs_buf_item_format_segment xfs: rename bli_format to avoid confusion with bli_formats xfs: use b_maps[] for discontiguous buffers
| * | xfs: recalculate leaf entry pointer after compacting a dir2 blockEric Sandeen2013-01-161-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dave Jones hit this assert when doing a compile on recent git, with CONFIG_XFS_DEBUG enabled: XFS: Assertion failed: (char *)dup - (char *)hdr == be16_to_cpu(*xfs_dir2_data_unused_tag_p(dup)), file: fs/xfs/xfs_dir2_data.c, line: 828 Upon further digging, the tag found by xfs_dir2_data_unused_tag_p(dup) contained "2" and not the proper offset, and I found that this value was changed after the memmoves under "Use a stale leaf for our new entry." in xfs_dir2_block_addname(), i.e. memmove(&blp[mid + 1], &blp[mid], (highstale - mid) * sizeof(*blp)); overwrote it. What has happened is that the previous call to xfs_dir2_block_compact() has rearranged things; it changes btp->count as well as the blp array. So after we make that call, we must recalculate the proper pointer to the leaf entries by making another call to xfs_dir2_block_leaf_p(). Dave provided a metadump image which led to a simple reproducer (create a particular filename in the affected directory) and this resolves the testcase as well as the bug on his live system. Thanks also to dchinner for looking at this one with me. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: remove int casts from debug dquot soft limit timer assertsBrian Foster2013-01-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The int casts here make it easy to trigger an assert with a large soft limit. For example, set a >4TB soft limit on an empty volume to reproduce a (0 > -x) comparison due to an overflow of d_blk_softlimit. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: fix the multi-segment log buffer formatMark Tinguely2013-01-162-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per Dave Chinner suggestion, this patch: 1) Corrects the detection of whether a multi-segment buffer is still tracking data. 2) Clears all the buffer log formats for a multi-segment buffer. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: fix segment in xfs_buf_item_format_segmentMark Tinguely2013-01-161-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Not every segment in a multi-segment buffer is dirty in a transaction and they will not be outputted. The assert in xfs_buf_item_format_segment() that checks for the at least one chunk of data in the segment to be used is not necessary true for multi-segmented buffers. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: rename bli_format to avoid confusion with bli_formatsMark Tinguely2013-01-163-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the bli_format structure to __bli_format to avoid accidently confusing them with the bli_formats pointer. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
| * | xfs: use b_maps[] for discontiguous buffersMark Tinguely2013-01-162-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commits starting at 77c1a08 introduced a multiple segment support to xfs_buf. xfs_trans_buf_item_match() could not find a multi-segment buffer in the transaction because it was looking at the single segment block number rather than the multi-segment b_maps[0].bm.bn. This results on a recursive buffer lock that can never be satisfied. This patch: 1) Changed the remaining b_map accesses to be b_maps[0] accesses. 2) Renames the single segment b_map structure to __b_map to avoid future confusion. Signed-off-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
* | | Merge branch 'for_linus' of ↵Linus Torvalds2013-01-162-2/+4
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext3 and udf fixes from Jan Kara: "One ext3 performance regression fix and one udf regression fix (oops on interrupted mount)." * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: UDF: Fix a null pointer dereference in udf_sb_free_partitions jbd: don't wake kjournald unnecessarily
| * | | UDF: Fix a null pointer dereference in udf_sb_free_partitionsNamjae Jeon2013-01-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a regression caused by commit bff943af6fe "udf: Fix memory leak when mounting" due to which it was triggering a kernel null point dereference in case of interrupted mount OR when allocating memory to sbi->s_partmaps failed in function udf_sb_alloc_partition_maps. Reported-and-tested-by: James Hogan <james@albanarts.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | jbd: don't wake kjournald unnecessarilyEric Sandeen2013-01-141-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't send an extra wakeup to kjournald in the case where we already have the proper target in j_commit_request, i.e. that commit has already been requested for commit. commit d9b0193 "jbd: fix fsync() tid wraparound bug" changed the logic leading to a wakeup, but it caused some extra wakeups which were found to lead to a measurable performance regression. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>
* | | | vfs: add missing virtual cache flush after editing partial pagesLinus Torvalds2013-01-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew Morton pointed this out a month ago, and then I completely forgot about it. If we read a partial last page of a block device, we will zero out the end of the page, but since that page can then be mapped into user space, we should also make sure to flush the cache on architectures that have virtual caches. We have the flush_dcache_page() function for this, so use it. Now, in practice this really never matters, because nobody sane uses virtual caches to begin with, and they largely exist on old broken RISC arhitectures. And even if you did run on one of those obsolete CPU's, the whole "mmap and access the last partial page of a block device" behavior probably doesn't actually exist. The normal IO functions (read/write) will never see the zeroed-out part of the page that migth not be coherent in the cache, because they honor the size of the device. So I'm marking this for stable (3.7 only), but I'm not sure anybody will ever care. Pointed-out-by: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org # 3.7 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge tag 'driver-core-3.8-rc3' of ↵Linus Torvalds2013-01-141-1/+1
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg Kroah-Hartman: "Here are two patches for 3.8-rc3. One removes the __dev* defines from init.h now that all usages of it are gone from your tree. The other fix is for debugfs's paramater that was using the wrong base for the option. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: debugfs: convert gid= argument from decimal, not octal Remove __dev* markings from init.h
| * | | debugfs: convert gid= argument from decimal, not octalDave Reisner2013-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch technically breaks userspace, but I suspect that anyone who actually used this flag would have encountered this brokenness, declared it lunacy, and already sent a patch. Signed-off-by: Dave Reisner <dreisner@archlinux.org> Reviewed-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* | | | fs/exec.c: work around icc miscompilationXi Wang2013-01-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tricky problem is this check: if (i++ >= max) icc (mis)optimizes this check as: if (++i > max) The check now becomes a no-op since max is MAX_ARG_STRINGS (0x7FFFFFFF). This is "allowed" by the C standard, assuming i++ never overflows, because signed integer overflow is undefined behavior. This optimization effectively reverts the previous commit 362e6663ef23 ("exec.c, compat.c: fix count(), compat_count() bounds checking") that tries to fix the check. This patch simply moves ++ after the check. Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | seq_file: fix new kernel-doc warningsRandy Dunlap2013-01-101-1/+1
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix kernel-doc warnings in fs/seq_file.c: Warning(fs/seq_file.c:304): No description found for parameter 'whence' Warning(fs/seq_file.c:304): Excess function parameter 'origin' description in 'seq_lseek' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2013-01-081-1/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) New sysctl ndisc_notify needs some documentation, from Hanns Frederic Sowa. 2) Netfilter REJECT target doesn't set transport header of SKB correctly, from Mukund Jampala. 3) Forcedeth driver needs to check for DMA mapping failures, from Larry Finger. 4) brcmsmac driver can't use usleep_range while holding locks, use udelay instead. From Niels Ole Salscheider. 5) Fix unregister of netlink bridge multicast database handlers, from Vlad Yasevich and Rami Rosen. 6) Fix checksum calculations in netfilter's ipv6 network prefix translation module. 7) Fix high order page allocation failures in netfilter xt_recent, from Eric Dumazet. 8) mac802154 needs to use netif_rx_ni() instead of netif_rx() because mac802154_process_data() can execute in process rather than interrupt context. From Alexander Aring. 9) Fix splice handling of MSG_SENDPAGE_NOTLAST, otherwise we elide one tcp_push() too many. From Eric Dumazet and Willy Tarreau. 10) Fix skb->truesize tracking in XEN netfront driver, from Ian Campbell. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) xen/netfront: improve truesize tracking ipv4: fix NULL checking in devinet_ioctl() tcp: fix MSG_SENDPAGE_NOTLAST logic net/ipv4/ipconfig: really display the BOOTP/DHCP server's address. ip-sysctl: fix spelling errors mac802154: fix NOHZ local_softirq_pending 08 warning ipv6: document ndisc_notify in networking/ip-sysctl.txt ath9k: Fix Kconfig for ATH9K_HTC netfilter: xt_recent: avoid high order page allocations netfilter: fix missing dependencies for the NOTRACK target netfilter: ip6t_NPT: fix IPv6 NTP checksum calculation bridge: add empty br_mdb_init() and br_mdb_uninit() definitions. vxlan: allow live mac address change bridge: Correctly unregister MDB rtnetlink handlers brcmfmac: fix parsing rsn ie for ap mode. brcmsmac: add copyright information for Canonical rtlwifi: rtl8723ae: Fix warning for unchecked pci_map_single() call rtlwifi: rtl8192se: Fix warning for unchecked pci_map_single() call rtlwifi: rtl8192de: Fix warning for unchecked pci_map_single() call rtlwifi: rtl8192ce: Fix warning for unchecked pci_map_single() call ...
| * | | tcp: fix MSG_SENDPAGE_NOTLAST logicEric Dumazet2013-01-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 35f9c09fe9c72e (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag : MSG_SENDPAGE_NOTLAST meant to be set on all frags but the last one for a splice() call. The condition used to set the flag in pipe_to_sendpage() relied on splice() user passing the exact number of bytes present in the pipe, or a smaller one. But some programs pass an arbitrary high value, and the test fails. The effect of this bug is a lack of tcp_push() at the end of a splice(pipe -> socket) call, and possibly very slow or erratic TCP sessions. We should both test sd->total_len and fact that another fragment is in the pipe (pipe->nrbufs > 1) Many thanks to Willy for providing very clear bug report, bisection and test programs. Reported-by: Willy Tarreau <w@1wt.eu> Bisected-by: Willy Tarreau <w@1wt.eu> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2013-01-076-70/+90
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull CIFS fixes from Steve French: "Misc small cifs fixes" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: CIFS: Don't let read only caching for mandatory byte-range locked files CIFS: Fix write after setting a read lock for read oplock files Revert "CIFS: Fix write after setting a read lock for read oplock files" cifs: adjust sequence number downward after signing NT_CANCEL request cifs: move check for NULL socket into smb_send_rqst
| * | | | CIFS: Don't let read only caching for mandatory byte-range locked filesPavel Shilovsky2013-01-024-2/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have mandatory byte-range locks on a file we can't cache reads because pagereading may have conflicts with these locks on the server. That's why we should allow level2 oplocks for files without mandatory locks only. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | CIFS: Fix write after setting a read lock for read oplock filesPavel Shilovsky2013-01-021-28/+20Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we have a read oplock and set a read lock in it, we can't write to the locked area - so, filemap_fdatawrite may fail with a no information for a userspace application even if we request a write to non-locked area. Fix this by writing directly to the server and then breaking oplock level from level2 to None. Also remove CONFIG_CIFS_SMB2 ifdefs because it's suitable for both CIFS and SMB2 protocols. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | Revert "CIFS: Fix write after setting a read lock for read oplock files"Pavel Shilovsky2013-01-023-65/+31Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | that solution has data races and can end up two identical writes to the server: when clientCanCacheAll value can be changed during the execution of __generic_file_aio_write. Signed-off-by: Pavel Shilovsky <piastry@etersoft.ru> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | cifs: adjust sequence number downward after signing NT_CANCEL requestJeff Layton2012-12-301-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a call goes out, the signing code adjusts the sequence number upward by two to account for the request and the response. An NT_CANCEL however doesn't get a response of its own, it just hurries the server along to get it to respond to the original request more quickly. Therefore, we must adjust the sequence number back down by one after signing a NT_CANCEL request. Cc: <stable@vger.kernel.org> Reported-by: Tim Perry <tdparmor-sambabugs@yahoo.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
| * | | | cifs: move check for NULL socket into smb_send_rqstJeff Layton2012-12-301-3/+3
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cai reported this oops: [90701.616664] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028 [90701.625438] IP: [<ffffffff814a343e>] kernel_setsockopt+0x2e/0x60 [90701.632167] PGD fea319067 PUD 103fda4067 PMD 0 [90701.637255] Oops: 0000 [#1] SMP [90701.640878] Modules linked in: des_generic md4 nls_utf8 cifs dns_resolver binfmt_misc tun sg igb iTCO_wdt iTCO_vendor_support lpc_ich pcspkr i2c_i801 i2c_core i7core_edac edac_core ioatdma dca mfd_core coretemp kvm_intel kvm crc32c_intel microcode sr_mod cdrom ata_generic sd_mod pata_acpi crc_t10dif ata_piix libata megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [90701.677655] CPU 10 [90701.679808] Pid: 9627, comm: ls Tainted: G W 3.7.1+ #10 QCI QSSC-S4R/QSSC-S4R [90701.688950] RIP: 0010:[<ffffffff814a343e>] [<ffffffff814a343e>] kernel_setsockopt+0x2e/0x60 [90701.698383] RSP: 0018:ffff88177b431bb8 EFLAGS: 00010206 [90701.704309] RAX: ffff88177b431fd8 RBX: 00007ffffffff000 RCX: ffff88177b431bec [90701.712271] RDX: 0000000000000003 RSI: 0000000000000006 RDI: 0000000000000000 [90701.720223] RBP: ffff88177b431bc8 R08: 0000000000000004 R09: 0000000000000000 [90701.728185] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001 [90701.736147] R13: ffff88184ef92000 R14: 0000000000000023 R15: ffff88177b431c88 [90701.744109] FS: 00007fd56a1a47c0(0000) GS:ffff88105fc40000(0000) knlGS:0000000000000000 [90701.753137] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [90701.759550] CR2: 0000000000000028 CR3: 000000104f15f000 CR4: 00000000000007e0 [90701.767512] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [90701.775465] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [90701.783428] Process ls (pid: 9627, threadinfo ffff88177b430000, task ffff88185ca4cb60) [90701.792261] Stack: [90701.794505] 0000000000000023 ffff88177b431c50 ffff88177b431c38 ffffffffa014fcb1 [90701.802809] ffff88184ef921bc 0000000000000000 00000001ffffffff ffff88184ef921c0 [90701.811123] ffff88177b431c08 ffffffff815ca3d9 ffff88177b431c18 ffff880857758000 [90701.819433] Call Trace: [90701.822183] [<ffffffffa014fcb1>] smb_send_rqst+0x71/0x1f0 [cifs] [90701.828991] [<ffffffff815ca3d9>] ? schedule+0x29/0x70 [90701.834736] [<ffffffffa014fe6d>] smb_sendv+0x3d/0x40 [cifs] [90701.841062] [<ffffffffa014fe96>] smb_send+0x26/0x30 [cifs] [90701.847291] [<ffffffffa015801f>] send_nt_cancel+0x6f/0xd0 [cifs] [90701.854102] [<ffffffffa015075e>] SendReceive+0x18e/0x360 [cifs] [90701.860814] [<ffffffffa0134a78>] CIFSFindFirst+0x1a8/0x3f0 [cifs] [90701.867724] [<ffffffffa013f731>] ? build_path_from_dentry+0xf1/0x260 [cifs] [90701.875601] [<ffffffffa013f731>] ? build_path_from_dentry+0xf1/0x260 [cifs] [90701.883477] [<ffffffffa01578e6>] cifs_query_dir_first+0x26/0x30 [cifs] [90701.890869] [<ffffffffa015480d>] initiate_cifs_search+0xed/0x250 [cifs] [90701.898354] [<ffffffff81195970>] ? fillonedir+0x100/0x100 [90701.904486] [<ffffffffa01554cb>] cifs_readdir+0x45b/0x8f0 [cifs] [90701.911288] [<ffffffff81195970>] ? fillonedir+0x100/0x100 [90701.917410] [<ffffffff81195970>] ? fillonedir+0x100/0x100 [90701.923533] [<ffffffff81195970>] ? fillonedir+0x100/0x100 [90701.929657] [<ffffffff81195848>] vfs_readdir+0xb8/0xe0 [90701.935490] [<ffffffff81195b9f>] sys_getdents+0x8f/0x110 [90701.941521] [<ffffffff815d3b99>] system_call_fastpath+0x16/0x1b [90701.948222] Code: 66 90 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 53 48 83 ec 08 83 fe 01 48 8b 98 48 e0 ff ff 48 c7 80 48 e0 ff ff ff ff ff ff 74 22 <48> 8b 47 28 ff 50 68 65 48 8b 14 25 f0 c6 00 00 48 89 9a 48 e0 [90701.970313] RIP [<ffffffff814a343e>] kernel_setsockopt+0x2e/0x60 [90701.977125] RSP <ffff88177b431bb8> [90701.981018] CR2: 0000000000000028 [90701.984809] ---[ end trace 24bd602971110a43 ]--- This is likely due to a race vs. a reconnection event. The current code checks for a NULL socket in smb_send_kvec, but that's too late. By the time that check is done, the socket will already have been passed to kernel_setsockopt. Move the check into smb_send_rqst, so that it's checked earlier. In truth, this is a bit of a half-assed fix. The -ENOTSOCK error return here looks like it could bubble back up to userspace. The locking rules around the ssocket pointer are really unclear as well. There are cases where the ssocket pointer is changed without holding the srv_mutex, but I'm not clear whether there's a potential race here yet or not. This code seems like it could benefit from some fundamental re-think of how the socket handling should behave. Until then though, this patch should at least fix the above oops in most cases. Cc: <stable@vger.kernel.org> # 3.7+ Reported-and-Tested-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com>
* | | | Merge tag 'ext4_for_linus' of ↵Linus Torvalds2013-01-072-2/+3
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 regression fixes from Ted Ts'o: "Bug fixes, including two regressions introduced in v3.8. The most serious of these regressions is a buffer cache leak." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: remove duplicate call to ext4_bread() in ext4_init_new_dir() ext4: release buffer in failed path in dx_probe() ext4: fix configuration dependencies for ext4 ACLs and security labels
| * | | | ext4: remove duplicate call to ext4_bread() in ext4_init_new_dir()Guo Chao2013-01-071-1/+0Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a buffer cache leak when creating a directory, introduced in commit a774f9c20. Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Tao Ma <boyu.mt@taobao.com>
| * | | | ext4: release buffer in failed path in dx_probe()Guo Chao2013-01-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If checksum fails, we should also release the buffer read from previous iteration. Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>- Cc: stable@vger.kernel.org -- fs/ext4/namei.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
| * | | | ext4: fix configuration dependencies for ext4 ACLs and security labelsValerie Aurora2013-01-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit "ext4: Remove CONFIG_EXT4_FS_XATTR" removed the configuration dependencies for ext4 xattrs from the ext4 ACLs and security labels configuration options, but did not replace them with a dependency on ext4 itself. Add back the dependency on ext4 so the options only show up if ext4 is enabled. Signed-off-by: Valerie Aurora <val@vaaconsulting.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Tao Ma <boyu.mt@taobao.com>
* | | | | Merge tag 'nfs-for-3.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2013-01-077-22/+38
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client bugfixes from Trond Myklebust: - Fix a permissions problem when opening NFSv4 files that only have the exec bit set. - Fix a couple of typos in pNFS (inverted logic), and the mount parsing (missing pointer dereference). - Work around a series of deadlock issues due to workqueues using struct work_struct pointer address comparisons in the re-entrancy tests. Ensure that we don't free struct work_struct prematurely if our work function involves waiting for completion of other work items (e.g. by calling rpc_shutdown_client). - Revert the part of commit 168e4b3 that is causing unnecessary warnings to be issued in the nfsd callback code. * tag 'nfs-for-3.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: avoid dereferencing null pointer in initiate_bulk_draining SUNRPC: Partial revert of commit 168e4b39d1afb79a7e3ea6c3bb246b4c82c6bdb9 NFS: Ensure that we free the rpc_task after read and write cleanups are done SUNRPC: Ensure that we free the rpc_task after cleanups are done nfs: fix null checking in nfs_get_option_str() pnfs: Increase the refcount when LAYOUTGET fails the first time NFS: Fix access to suid/sgid executables