summaryrefslogtreecommitdiffstats
path: root/fs/f2fs/data.c
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'ext4_for_linus' of ↵Linus Torvalds2016-12-141-1/+3
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "This merge request includes the dax-4.0-iomap-pmd branch which is needed for both ext4 and xfs dax changes to use iomap for DAX. It also includes the fscrypt branch which is needed for ubifs encryption work as well as ext4 encryption and fscrypt cleanups. Lots of cleanups and bug fixes, especially making sure ext4 is robust against maliciously corrupted file systems --- especially maliciously corrupted xattr blocks and a maliciously corrupted superblock. Also fix ext4 support for 64k block sizes so it works well on ppcle. Fixed mbcache so we don't miss some common xattr blocks that can be merged" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits) dax: Fix sleep in atomic contex in grab_mapping_entry() fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FL fscrypt: Delay bounce page pool allocation until needed fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page() fscrypt: Cleanup fscrypt_{decrypt,encrypt}_page() fscrypt: Never allocate fscrypt_ctx on in-place encryption fscrypt: Use correct index in decrypt path. fscrypt: move the policy flags and encryption mode definitions to uapi header fscrypt: move non-public structures and constants to fscrypt_private.h fscrypt: unexport fscrypt_initialize() fscrypt: rename get_crypt_info() to fscrypt_get_crypt_info() fscrypto: move ioctl processing more fully into common code fscrypto: remove unneeded Kconfig dependencies MAINTAINERS: fscrypto: recommend linux-fsdevel for fscrypto patches ext4: do not perform data journaling when data is encrypted ext4: return -ENOMEM instead of success ext4: reject inodes with negative size ext4: remove another test in ext4_alloc_file_blocks() Documentation: fix description of ext4's block_validity mount option ext4: fix checks for data=ordered and journal_async_commit options ...
| * fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page()David Gstir2016-12-111-1/+0Star
| | | | | | | | | | | | | | | | | | | | Rename the FS_CFLG_INPLACE_ENCRYPTION flag to FS_CFLG_OWN_PAGES which, when set, indicates that the fs uses pages under its own control as opposed to writeback pages which require locking and a bounce buffer for encryption. Signed-off-by: David Gstir <david@sigma-star.at> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * fscrypt: Let fs select encryption index/tweakDavid Gstir2016-11-141-2/+3
| | | | | | | | | | | | | | | | | | | | | | Avoid re-use of page index as tweak for AES-XTS when multiple parts of same page are encrypted. This will happen on multiple (partial) calls of fscrypt_encrypt_page on same page. page->index is only valid for writeback pages. Signed-off-by: David Gstir <david@sigma-star.at> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * fscrypt: Enable partial page encryptionDavid Gstir2016-11-141-0/+2
| | | | | | | | | | | | | | | | | | Not all filesystems work on full pages, thus we should allow them to hand partial pages to fscrypt for en/decryption. Signed-off-by: David Gstir <david@sigma-star.at> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | Merge tag 'for-f2fs-4.10' of ↵Linus Torvalds2016-12-141-65/+127
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch series contains several performance tuning patches regarding to the IO submission flow, in addition to supporting new features such as a ZBC-base drive and multiple devices. It also includes some major bug fixes such as: - checkpoint version control - fdatasync-related roll-forward recovery routine - memory boundary or null-pointer access in corner cases - missing error cases It has various minor clean-up patches as well" * tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits) f2fs: fix a missing size change in f2fs_setattr f2fs: fix to access nullified flush_cmd_control pointer f2fs: free meta pages if sanity check for ckpt is failed f2fs: detect wrong layout f2fs: call sync_fs when f2fs is idle Revert "f2fs: use percpu_counter for # of dirty pages in inode" f2fs: return AOP_WRITEPAGE_ACTIVATE for writepage f2fs: do not activate auto_recovery for fallocated i_size f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack f2fs: fix 32-bit build f2fs: set ->owner for debugfs status file's file_operations f2fs: fix incorrect free inode count in ->statfs f2fs: drop duplicate header timer.h f2fs: fix wrong AUTO_RECOVER condition f2fs: do not recover i_size if it's valid f2fs: fix fdatasync f2fs: fix to account total free nid correctly f2fs: fix an infinite loop when flush nodes in cp f2fs: don't wait writeback for datas during checkpoint f2fs: fix wrong written_valid_blocks counting ...
| * | f2fs: return AOP_WRITEPAGE_ACTIVATE for writepageChao Yu2016-11-301-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | We should use AOP_WRITEPAGE_ACTIVATE when we bypass writing pages. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Miao Xie <miaoxie@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: don't wait writeback for datas during checkpointChao Yu2016-11-251-6/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Normally, while committing checkpoint, we will wait on all pages to be writebacked no matter the page is data or metadata, so in scenario where there are lots of data IO being submitted with metadata, we may suffer long latency for waiting writeback during checkpoint. Indeed, we only care about persistence for pages with metadata, but not pages with data, as file system consistent are only related to metadate, so in order to avoid encountering long latency in above scenario, let's recognize and reference metadata in submitted IOs, wait writeback only for metadatas. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: fix redundant block allocationJaegeuk Kim2016-11-251-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In direct_IO path of f2fs_file_write_iter(), 1. f2fs_preallocate_blocks(F2FS_GET_BLOCK_PRE_DIO) -> allocate LBA X 2. f2fs_direct_IO() -> return 0; Then, f2fs_write_data_page() will allocate another LBA X+1. This makes EIO triggered by HM-SMR. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: use err for f2fs_preallocate_blocksJaegeuk Kim2016-11-251-13/+13
| | | | | | | | | | | | | | | | | | This patch has no functional change. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: support multiple devicesJaegeuk Kim2016-11-251-7/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements multiple devices support for f2fs. Given multiple devices by mkfs.f2fs, f2fs shows them entirely as one big volume under one f2fs instance. Internal block management is very simple, but we will modify block allocation and background GC policy to boost IO speed by exploiting them accoording to each device speed. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: allow dio read for LFS modeJaegeuk Kim2016-11-251-1/+1
| | | | | | | | | | | | | | | | | | | | | We can allow dio reads for LFS mode, while doing buffered writes for dio writes. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: revert segment allocation for direct IOJaegeuk Kim2016-11-251-5/+1Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now we don't need to be too much careful about storage alignment for dio, since its speed becomes quite fast and we'd better avoid any misalignment first. Revert: 38aa0889b250 (f2fs: align direct_io'ed data to section) Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: Use generic zoned block device terminologyDamien Le Moal2016-11-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SMR stands for "Shingled Magnetic Recording" which makes sense only for hard disk drives (spinning rust). The ZBC/ZAC standards enable management of SMR disks, but solid state drives may also support those standards. So rename the HMSMR feature to BLKZONED to avoid a HDD centric terminology. For the same reason, rename f2fs_sb_mounted_hmsmr to f2fs_sb_mounted_blkzoned. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: hide a maybe-uninitialized warningArnd Bergmann2016-11-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc is unsure about the use of last_ofs_in_node, which might happen without a prior initialization: fs/f2fs//git/arm-soc/fs/f2fs/data.c: In function ‘f2fs_map_blocks’: fs/f2fs/data.c:799:54: warning: ‘last_ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized] if (prealloc && dn.ofs_in_node != last_ofs_in_node + 1) { As pointed out by Chao Yu, the code is actually correct as 'prealloc' is only set if the last_ofs_in_node has been set, the two always get updated together. This initializes last_ofs_in_node to dn.ofs_in_node for each new dnode at the start of the 'next_block' loop, which at that point is a correct initialization as well. I assume that compilers that correctly track the contents of the variables and do not warn about the condition also figure out that they can eliminate the extra assignment here. Fixes: 46008c6d4232 ("f2fs: support in batch multi blocks preallocation") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: use BIO_MAX_PAGES for bio allocationJaegeuk Kim2016-11-231-3/+1Star
| | | | | | | | | | | | | | | | | | We don't need to allocate bio partially in order to maximize sequential writes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: be aware of extent beyond EOF in fiemapChao Yu2016-11-231-14/+4Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | f2fs can support fallocating blocks beyond file size without changing the size, but ->fiemap of f2fs was restricted and can't detect these extents fallocated past EOF, now relieve the restriction. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: don't miss any f2fs_balance_fs casesChao Yu2016-11-231-8/+3Star
| | | | | | | | | | | | | | | | | | | | | | | | In f2fs_map_blocks, let f2fs_balance_fs detects node page modification with dn.node_changed to avoid miss some corner cases. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * | f2fs: give a chance to detach from dirty listChao Yu2016-11-231-3/+5
| |/ | | | | | | | | | | | | | | | | If there is no dirty pages in inode, we should give a chance to detach the inode from global dirty list, otherwise it needs to call another unnecessary .writepages for detaching. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* | writeback: add wbc_to_write_flags()Jens Axboe2016-11-021-1/+1
| | | | | | | | | | | | | | | | | | | | Add wbc_to_write_flags(), which returns the write modifier flags to use, based on a struct writeback_control. No functional changes in this patch, but it prepares us for factoring other wbc fields for write type. Signed-off-by: Jens Axboe <axboe@fb.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de>
* | block,fs: use REQ_* flags directlyChristoph Hellwig2016-11-011-9/+7Star
|/ | | | | | | | | Remove the WRITE_* and READ_SYNC wrappers, and just use the flags directly. Where applicable this also drops usage of the bio_set_op_attrs wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* fs: use mapping_set_error instead of opencoded set_bitMichal Hocko2016-10-121-1/+1
| | | | | | | | | | | The mapping_set_error() helper sets the correct AS_ flag for the mapping so there is no reason to open code it. Use the helper directly. [akpm@linux-foundation.org: be honest about conversion from -ENXIO to -EIO] Link: http://lkml.kernel.org/r/20160912111608.2588-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* f2fs: don't submit irrelevant pageChao Yu2016-10-011-1/+7
| | | | | | | | | | | | | While we call ->writepages, there are two cases: a. we didn't writeout any dirty pages, since they are writebacked by other thread concurrently. b. we writeout dirty pages, and have already submitted bio to block layer. In these cases, we don't need to do additional bio flushing unnecessarily, it may split bio in cache into smaller one. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: support configuring fault injection per superblockChao Yu2016-10-011-1/+1
| | | | | | | | | | | | | | | | | Previously, we only support global fault injection configuration, so that when we configure type/rate of fault injection through sysfs, mount option, it will influence all f2fs partition which is being used. It is not make sence, since it will be not convenient if developer want to test separated partitions with different fault injection rate/type simultaneously, also it's not possible to enable fault injection in one partition and disable fault injection in other one. >From now on, we move global configuration of fault injection in module into per-superblock, hence injection testing can be more flexible. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add customized migrate_page callbackWeichao Guo2016-10-011-0/+55
| | | | | | | | | | | | | | This patch improves the migration of dirty pages and allows migrating atomic written pages that F2FS uses in Page Cache. Instead of the fallback releasing page path, it provides better performance for memory compaction, CMA and other users of memory page migrating. For dirty pages, there is no need to write back first when migrating. For an atomic written page before committing, we can migrate the page and update the related 'inmem_pages' list at the same time. Signed-off-by: Weichao Guo <guoweichao@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix some coding style] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: preallocate blocks for encrypted fileYunlei He2016-09-221-5/+1Star
| | | | | | | | | | This patch allow preallocates data blocks for buffered aio writes in encrypted file. Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix to avoid BUG_ON] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: support IO error injectionChao Yu2016-09-221-0/+5
| | | | | | | | This patch adds to support IO error injection for testing IO error tolerance of f2fs. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to set PageUptodate in f2fs_write_end correctlyJaegeuk Kim2016-09-131-10/+19
| | | | | | | | | | | | | | | | | | | | | Previously, f2fs_write_begin sets PageUptodate all the time. But, when user tries to update the entire page (i.e., len == PAGE_SIZE), we need to consider that the page is able to be copied partially afterwards. In such the case, we will lose the remaing region in the page. This patch fixes this by setting PageUptodate in f2fs_write_end as given copied result. In the short copy case, it returns zero to let generic_perform_write retry copying user data again. As a result, f2fs_write_end() works: PageUptodate len copied return retry 1. no 4096 4096 4096 false -> return 4096 2. no 4096 1024 0 true -> goto #1 case 3. yes 2048 2048 2048 false -> return 2048 4. yes 2048 1024 1024 false -> return 1024 Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: check free_sections for defragmentationJaegeuk Kim2016-09-121-2/+2
| | | | | | | Fix wrong condition check for defragmentation of a file. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: no need to make zeros beyond i_sizeJaegeuk Kim2016-09-081-9/+0Star
| | | | | | | | We don't need to make zeros beyond i_size, since we already wrote that through NEW_ADDR case. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to preallocate block only aligned to 4KChao Yu2016-08-301-1/+9
| | | | | | | | | | | | | | | In write_begin(), we skip checking dnode block for preallocating block when whole block needs to be updated since we preallocated its block in f2fs_preallocate_blocks, for partial updated block, we will still try to lock its node and do preallocation in write_begin(), so in f2fs_preallocate_blocks we should not preallocate its block. But previously, the calculation of preallocating block number is incorrect, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: fix a bug] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix non static symbol warningWei Yongjun2016-08-301-2/+2
| | | | | | | | | | | Fixes the following sparse warning: fs/f2fs/data.c:969:12: warning: symbol 'f2fs_grab_bio' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to do f2fs_balance_fs in f2fs_map_blocks correctlyChao Yu2016-08-301-0/+1
| | | | | | | | If we preallocate blocks with f2fs_reserve_blocks in f2fs_map_blocks, we should call f2fs_balance_fs for checking and reclaiming space, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* Revert "f2fs: move i_size_write in f2fs_write_end"Chao Yu2016-08-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit a2ee0a300344a6da76186129b078113354fe13d2. When testing with generic/032 of xfstest suit, failure message will be reported as below: generic/032 8s ... [failed, exit status 1] - output mismatch (see results/generic/032.out.bad) --- tests/generic/032.out 2015-01-11 16:52:27.643681072 +0800 +++ results/generic/032.out.bad 2016-08-06 13:44:43.861330500 +0800 @@ -1,5 +1,5 @@ QA output created by 032 -100 iterations -0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd -* -0100000 +1: [768..775]: unwritten +Unwritten extents found! ... (Run 'diff -u tests/generic/032.out results/generic/032.out.bad' to see the entire diff) Ran: generic/032 Failures: generic/032 Failed 1 of 1 tests In write_end(), we should update i_size of inode before unlock page, otherwise, we will lose newly updated data in following race condition. Thread A Thread B - write_end - unlock page - writepages - lock_page - writepage if page is out-of-range of file size, we will skip writting the page. - update i_size Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: drop bio->bi_rw manual assignmentJens Axboe2016-08-041-1/+0Star
| | | | | | | | Merge 4fc29c1aa375 included this extra line, but it's not needed (or useful) since we'll bio_set_op_attrs() right after to properly set the op and flags for the bio. Signed-off-by: Jens Axboe <axboe@fb.com>
* Merge tag 'for-f2fs-4.8' of ↵Linus Torvalds2016-07-271-149/+162
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "The major change in this version is mitigating cpu overheads on write paths by replacing redundant inode page updates with mark_inode_dirty calls. And we tried to reduce lock contentions as well to improve filesystem scalability. Other feature is setting F2FS automatically when detecting host-managed SMR. Enhancements: - ioctl to move a range of data between files - inject orphan inode errors - avoid flush commands congestion - support lazytime Bug fixes: - return proper results for some dentry operations - fix deadlock in add_link failure - disable extent_cache for fcollapse/finsert" * tag 'for-f2fs-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits) f2fs: clean up coding style and redundancy f2fs: get victim segment again after new cp f2fs: handle error case with f2fs_bug_on f2fs: avoid data race when deciding checkpoin in f2fs_sync_file f2fs: support an ioctl to move a range of data blocks f2fs: fix to report error number of f2fs_find_entry f2fs: avoid memory allocation failure due to a long length f2fs: reset default idle interval value f2fs: use blk_plug in all the possible paths f2fs: fix to avoid data update racing between GC and DIO f2fs: add maximum prefree segments f2fs: disable extent_cache for fcollapse/finsert inodes f2fs: refactor __exchange_data_block for speed up f2fs: fix ERR_PTR returned by bio f2fs: avoid mark_inode_dirty f2fs: move i_size_write in f2fs_write_end f2fs: fix to avoid redundant discard during fstrim f2fs: avoid mismatching block range for discard f2fs: fix incorrect f_bfree calculation in ->statfs f2fs: use percpu_rw_semaphore ...
| * f2fs: clean up coding style and redundancyJaegeuk Kim2016-07-251-2/+2
| | | | | | | | | | | | This patch includes minor clean-ups. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: use blk_plug in all the possible pathsJaegeuk Kim2016-07-161-0/+3
| | | | | | | | | | | | | | | | | | | | This patch reverts 19a5f5e2ef37 (f2fs: drop any block plugging), and adds blk_plug in write paths additionally. The main reason is that blk_start_plug can be used to wake up from low-power mode before submitting further bios. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix to avoid data update racing between GC and DIOChao Yu2016-07-161-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Datas in file can be operated by GC and DIO simultaneously, so we will face race case as below: For write case: Thread A Thread B - generic_file_direct_write - invalidate_inode_pages2_range - f2fs_direct_IO - do_blockdev_direct_IO - do_direct_IO - get_more_blocks - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_page - do_write_data_page migrate data block to new block address - dio_bio_submit update user data to old block address For read case: Thread A Thread B - generic_file_direct_write - invalidate_inode_pages2_range - f2fs_direct_IO - do_blockdev_direct_IO - do_direct_IO - get_more_blocks - f2fs_balance_fs - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_page - do_write_data_page migrate data block to new block address - write_checkpoint - do_checkpoint - clear_prefree_segments - f2fs_issue_discard discard old block adress - dio_bio_submit update user buffer from obsolete block address In order to fix this, for one file, we should let DIO and GC getting exclusion against with each other. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix ERR_PTR returned by bioJaegeuk Kim2016-07-161-1/+3
| | | | | | | | | | | | | | | | This is to fix wrong error pointer handling flow reported by Dan. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: move i_size_write in f2fs_write_endJaegeuk Kim2016-07-081-1/+1
| | | | | | | | | | | | We don't need to do i_size_write under page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: call SetPageUptodate if neededJaegeuk Kim2016-07-081-6/+12
| | | | | | | | | | | | | | SetPageUptodate() issues memory barrier, resulting in performance degrdation. Let's avoid that. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: introduce f2fs_set_page_dirty_nobufferJaegeuk Kim2016-07-081-1/+32
| | | | | | | | | | | | | | | | This patch adds f2fs_set_page_dirty_nobuffer() copied from __set_page_dirty_buffer. When appending 4KB blocks in f2fs on pmem with multiple cores, this improves the overall performance. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix to detect truncation prior rather than EIO during readChao Yu2016-07-081-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | In procedure of synchonized read, after sending out the read request, reader will try to lock the page for waiting device to finish the read jobs and unlock the page, but meanwhile, truncater will race with reader, so after reader get lock of the page, it should check page's mapping to detect whether someone has truncated the page in advance, then reader has the chance to do the retry if truncation was done, otherwise read can be failed due to previous condition check. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: fix to avoid reading out encrypted data in page cacheChao Yu2016-07-081-43/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For encrypted inode, if user overwrites data of the inode, f2fs will read encrypted data into page cache, and then do the decryption. However reader can race with overwriter, and it will see encrypted data which has not been decrypted by overwriter yet. Fix it by moving decrypting work to background and keep page non-uptodated until data is decrypted. Thread A Thread B - f2fs_file_write_iter - __generic_file_write_iter - generic_perform_write - f2fs_write_begin - f2fs_submit_page_bio - generic_file_read_iter - do_generic_file_read - lock_page_killable - unlock_page - copy_page_to_iter hit the encrypted data in updated page - lock_page - fscrypt_decrypt_page Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: avoid latency-critical readahead of node pagesJaegeuk Kim2016-07-061-1/+1
| | | | | | | | | | | | | | The f2fs_map_blocks is very related to the performance, so let's avoid any latency to read ahead node pages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: detect host-managed SMR by feature flagJaegeuk Kim2016-07-061-1/+2
| | | | | | | | | | | | | | If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs by default. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: introduce mode=lfs mount optionJaegeuk Kim2016-06-131-0/+2
| | | | | | | | | | | | | | | | | | This mount option is to enable original log-structured filesystem forcefully. So, there should be no random writes for main area. Especially, this supports host-managed SMR device. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: drop any block pluggingJaegeuk Kim2016-06-081-7/+10
| | | | | | | | | | | | | | In f2fs, we don't need to keep block plugging for NODE and DATA writes, since we already merged bios as much as possible. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: set mapping error for EIOJaegeuk Kim2016-06-081-1/+1
| | | | | | | | | | | | If EIO occurred, we need to set all the mapping to avoid any further IOs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
| * f2fs: handle writepage correctlyJaegeuk Kim2016-06-031-30/+14Star
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously, f2fs_write_data_pages() calls __f2fs_writepage() which calls f2fs_write_data_page(). If f2fs_write_data_page() returns AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage() calls mapping_set_error(). But, this should not happen at every time, since sometimes f2fs_write_data_page() tries to skip writing pages without error. For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed out. Reported-by: Shuoran Liu <liushuoran@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>