summaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* f2fs: avoid to use a NULL point in destroy_segment_managerChao Yu2013-11-061-0/+2
| | | | | | | | A NULL point should avoid to be used in destroy_segment_manager after allocating memory fail for f2fs_sm_info. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unnecessary TestClearPageError when wait pages writebackChao Yu2013-11-041-3/+4
| | | | | | | | | In wait_on_node_pages_writeback we will test and clear error flag for all pages in radix tree, but not necessary. So we only do this for pages belong to the specified inode. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid to wait all the node blocks during fsyncJaegeuk Kim2013-10-313-2/+44
| | | | | | | | | | Previously, f2fs_sync_file() waits for all the node blocks to be written. But, we don't need to do that, but wait only the inode-related node blocks. This patch adds wait_on_node_pages_writeback() in which waits inode-related node blocks that are on writeback. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: check all ones or zeros bitmap with bitops for better mount performanceChao Yu2013-10-301-4/+15
| | | | | | | | | | | | | | | | | | Previously, check_block_count check valid_map with bit data type in common scenario that sit has all ones or zeros bitmap, it makes low mount performance. So let's check the special bitmap with integer data type instead of the bit one. v1-->v2: o use find_next_{zero_}bit_le for better performance and readable as Jaegeuk suggested. o use neat logogram in comment as Gu Zheng suggested. o search continuous ones or zeros for better performance when checking mixed bitmap. Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Shu Tan <shu.tan@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: change the method of calculating the number summary blocksFan Li2013-10-301-8/+6Star
| | | | | | | | | | | | | npages_for_summary_flush uses (SUMMARY_SIZE + 1) as the size of a f2fs_summary while its actual size is SUMMARY_SIZE. So the result sometimes is bigger than actual number by one, which causes checkpoint can't be written into disk contiguously, and sometimes summary blocks can't be compacted like they should. Besides, when writing summary blocks into pages, if remain space in a page isn't big enough for one f2fs_summary, it will be left unused, current code seems not to take it into account. Signed-off-by: Fan Li <fanofcode.li@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix calculating incorrect free size when update xattr in __f2fs_setxattrChao Yu2013-10-291-1/+1
| | | | | | | | | | During xattr updating, free size should be corrected to remainder free size + old entry size. It can avoid ENOSPC error when we update old entry with the same size new entry at fully filled xattr. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add an option to avoid unnecessary BUG_ONsJaegeuk Kim2013-10-2911-57/+65
| | | | | | | If you want to remove unnecessary BUG_ONs, you can just turn off F2FS_CHECK_FS in your kernel config. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: introduce CONFIG_F2FS_CHECK_FS for BUG_ON controlJaegeuk Kim2013-10-291-0/+8
| | | | | | | This config will support an option to remove so many BUG_ONs that degrade the performance potentially. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix a deadlock during init_acl procedureJaegeuk Kim2013-10-283-10/+12
| | | | | | | | | | | | | | | | | | | | | | The deadlock is found through the following scenario. sys_mkdir() -> f2fs_add_link() -> __f2fs_add_link() -> init_inode_metadata() : lock_page(inode); -> f2fs_init_acl() -> f2fs_set_acl() -> f2fs_setxattr(..., NULL) : This NULL page incurs a deadlock at update_inode_page(). So, likewise f2fs_init_security(), this patch adds a parameter to transfer the locked inode page to f2fs_setxattr(). Found by Linux File System Verification project (linuxtesting.org). Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: clean up acl flow for better readabilityJaegeuk Kim2013-10-282-15/+16
| | | | | | This patch cleans up a couple of acl codes. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unnecessary segment bitmap updatesChangman Lee2013-10-281-23/+8Star
| | | | | | | | Only one dirty type is set in __locate_dirty_segment and we can know dirty type of segment. So we don't need to check other dirty types. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add tracepoint for vm_page_mkwriteJaegeuk Kim2013-10-251-0/+1
| | | | | | This patch adds a tracepoint for f2fs_vm_page_mkwrite. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add tracepoint for set_page_dirtyJaegeuk Kim2013-10-253-0/+6
| | | | | | This patch adds a tracepoint for set_page_dirty. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove redundant set_page_dirty from write_compacted_summariesChao Yu2013-10-251-4/+4
| | | | | | | | | | Previously, set_page_dirty is called every time after writting one summary info into compacted summary page, To avoid redundant set_page_dirty, we only call set_page_dirty before release page. Signed-off-by: Yu Chao <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: add reclaiming control by sysfsJaegeuk Kim2013-10-251-14/+36
| | | | | | | This patch adds a control method in sysfs to reclaim prefree segments. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: introduce f2fs_balance_fs_bg for some background jobsJaegeuk Kim2013-10-254-10/+15
| | | | | | | This patch merges some background jobs into this new function. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: reclaim prefree segments periodicallyJaegeuk Kim2013-10-255-1/+18
| | | | | | | | | | | | | | | | Previously, f2fs postpones reclaiming prefree segments into free segments as much as possible. However, if user writes and deletes a bunch of data without any sync or fsync calls, some flash storages can suffer from garbage collections. So, this patch adds the reclaiming codes to f2fs_write_node_pages and background GC thread. If there are a lot of prefree segments, let's do checkpoint so that f2fs submits discard commands for the prefree regions to the flash storage. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: use bool for booleansHaicheng Li2013-10-255-11/+11
| | | | | Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: clean up several status-related operationsJaegeuk Kim2013-10-255-25/+27
| | | | | | This patch cleans up improper definitions that update some status information. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: introduce f2fs_kmem_cache_alloc to hide the unfailed, kmem cache ↵Gu Zheng2013-10-224-41/+35Star
| | | | | | | | | | | | | | allocation Introduce the unfailed version of kmem_cache_alloc named f2fs_kmem_cache_alloc to hide the retry routine and make the code a bit cleaner. v2: Fix the wrong use of 'retry' tag pointed out by Gao feng. Use more neat code to remove redundant tag suggested by Haicheng Li. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: no need to check other dirty_segmap when the seg has been foundHaicheng Li2013-10-221-3/+7
| | | | | | | | Because one dirty seg can only be mapped to one dirty_type. Otherwise, it's a bug. Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> [Jaegeuk Kim: modify a comment related to this patch] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: use true and false for boolean valueHaicheng Li2013-10-221-2/+2
| | | | | Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid to write during the recoveryJaegeuk Kim2013-10-182-7/+12
| | | | | | | | | This patch enhances the recovery routine not to write any data/node/meta until its completion. If any writes are sent to the disk, it could contaminate the written history that will be used for further recovery. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid wait if IO end up when do_checkpoint for better performanceGu Zheng2013-10-183-2/+14
| | | | | | | | | | | | | | | | | | Previously, do_checkpoint() will call congestion_wait() for waiting the pages (previous submitted node/meta/data pages) to be written back. Because congestion_wait() will set a regular period (e.g. HZ / 50 ) for waiting, and no additional wake up mechanism was introduced if IO ends up before regular period costed. Yuan Zhong found there is a situation that after the pages have been written back, but the checkpoint thread still wait for congestion_wait to exit. So here we store checkpoint task into f2fs_sb when doing checkpoint, it'll wait for IO completes if there's IO going on, and in the end IO path, wake up checkpoint task when IO ends up. Thanks to Yuan Zhong's pre work about this problem. Reported-by: Yuan Zhong <yuan.mark.zhong@samsung.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: introduce function read_raw_super_block()Gu Zheng2013-10-181-21/+34
| | | | | | | | Introduce function read_raw_super_block() to hide reading raw super block and the retry routine if the first sb is invalid. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix the starvation problem on cp_rwsemJaegeuk Kim2013-10-182-14/+0Star
| | | | | | | | | | | | | This patch removes the logic previously introduced to address the starvation on cp_rwsem. One potential there-in bug is that we should cover the wait.list with spin_lock, but the previous code broke this rule. And, actually current rwsem handles this starvation issue reasonably, so that we didn't need to do this before neither. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix to store and retrieve i_rdev correctlyJaegeuk Kim2013-10-181-17/+32
| | | | | | When storing i_rdev, we should check its file type. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: fix writing incorrect orphan blocksJaegeuk Kim2013-10-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Previously, there was a erroneous scenario like below. thread 1: thread 2: f2fs_unlink - acquire_orphan_inode : sbi->n_orphans++ write_checkpoint - block_operations : f2fs_lock_all - do_checkpoint : write orphan blocks with sbi->n_orphans - unblock_operations - f2fs_lock_op - release_orphan_inode - f2fs_unlock_op During the checkpoint by thread 2, f2fs stores a wrong orphan block according to the wrong sbi->n_orphans. To avoid this, simply we should make cover acquire_orphan_inode too with f2fs_lock_op. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid unnecessary checkpointsJaegeuk Kim2013-10-081-1/+3
| | | | | | | During the f2fs_put_super procedure, we don't need to conduct checkpoint all the time, since we don't need to do that if superblock is clean. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: handle remount options correctlyKelly Anderson2013-10-071-0/+17
| | | | | | | | | | The current f2fs code errors if the xattr or acl options are passed when remounting. This is important in a typical scenario where f2fs is mounted as a "ro" root file-system by the boot loader and then the init process wants to remount it "rw" with the "remount,rw" option. Signed-off-by: Kelly Anderson <kelly@xilka.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: use rw_sem instead of fs_lock(locks mutex)Gu Zheng2013-10-079-116/+83Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fs_locks is used to block other ops(ex, recovery) when doing checkpoint. And each other operate routine(besides checkpoint) needs to acquire a fs_lock, there is a terrible problem here, if these are too many concurrency threads acquiring fs_lock, so that they will block each other and may lead to some performance problem, but this is not the phenomenon we want to see. Though there are some optimization patches introduced to enhance the usage of fs_lock, but the thorough solution is using a *rw_sem* to replace the fs_lock. Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other, this can avoid the problem described above completely. Because of the weakness of rw_sem, the above change may introduce a potential problem that the checkpoint thread might get starved if other threads are intensively locking the read semaphore for I/O.(Pointed out by Xu Jin) In order to avoid this, a wait_list is introduced, the appending read semaphore ops will be dropped into the wait_list if checkpoint thread is waiting for write semaphore, and will be waked up when checkpoint thread gives up write semaphore. Thanks to Kim's previous review and test, and will be very glad to see other guys' performance tests about this patch. V2: -fix the potential starvation problem. -use more suitable func name suggested by Xu Jin. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: adjust minor coding standard] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: account for orphan inodes during recoveryRuss W. Knize2013-09-251-6/+13
| | | | | | | | | | | | During recovery, orphan inodes are deleted via truncate_hole(). These orphans are added by recover_dentry() via f2fs_delete_entry(). However, f2fs_delete_entry() adds them via add_orphan_inode() without calling acquire_orphan_inode() first. This prevents the counters from being incremented properly, which causes them to underflow when remove_orphan_inode() is called later on. Signed-off-by: Russ Knize <rknize@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: don't GC or take an fs_lock from f2fs_initxattrs()Russ Knize2013-09-251-10/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | f2fs_initxattrs() is called internally from within F2FS and should not call functions that are used by VFS handlers. This avoids certain deadlocks: - vfs_create() - f2fs_create() <-- takes an fs_lock - f2fs_add_link() - __f2fs_add_link() - init_inode_metadata() - f2fs_init_security() - security_inode_init_security() - f2fs_initxattrs() - f2fs_setxattr() <-- also takes an fs_lock If the caller happens to grab the same fs_lock from the pool in both places, they will deadlock. There are also deadlocks involving multiple threads and mutexes: - f2fs_write_begin() - f2fs_balance_fs() <-- takes gc_mutex - f2fs_gc() - write_checkpoint() - block_operations() - mutex_lock_all() <-- blocks trying to grab all fs_locks - f2fs_mkdir() <-- takes an fs_lock - __f2fs_add_link() - f2fs_init_security() - security_inode_init_security() - f2fs_initxattrs() - f2fs_setxattr() - f2fs_balance_fs() <-- blocks trying to take gc_mutex Signed-off-by: Russ Knize <Russ.Knize@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: don't let the orphan inode counter underflowRuss W. Knize2013-09-251-0/+2
| | | | | | | | | | Accounting errors from buggy code calling the acquire/release/remove orphan inode interfaces can cause n_orphans to underflow, which will then cause acquire_orphan_inode() to return -ENOSPC on the next operation. This commit guards against that condition. Signed-off-by: Russ Knize <rknize@motorola.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: remove unneeded write checkpoint in recover_fsync_dataChao Yu2013-09-251-1/+4
| | | | | | | | | | | | Previously, recover_fsync_data still to write checkpoint when there is nothing to recover with normal umount image. It may reduce mount performance and flash memory lifetime, so let's remove it. Signed-off-by: Tan Shu <shu.tan@samsung.com> Signed-off-by: Yu Chao <chao2.yu@samsung.com> Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: avoid allocating failure in bio_allocChao Yu2013-09-242-1/+5
| | | | | | | | | | This patch add macro MAX_BIO_BLOCKS to limit value of npages in f2fs_bio_alloc, it can avoid allocating failure in bio_alloc caused by npages is larger than BIO_MAX_PAGES. Signed-off-by: Yu Chao <chao2.yu@samsung.com> Reviewed-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: optimize the victim searching loop slightlyJin Xu2013-09-241-6/+9
| | | | | | | | | | | | | | Since the MAX_VICTIM_SEARCH has been enlarged from 20 to 4096, the victim searching overhead will be increased much than before, especially for SSR that searches victim for use quiet often. This patch intends to reduce the overhead a little bit by: - make the get_gc_cost a inline routine to reduce function call overhead - reduce multiplication and division operations - reduce unnecessary comparison operation Signed-off-by: Jin Xu <jinuxstyle@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* f2fs: optimize fs_lock for better performanceYu Chao2013-09-241-2/+2
| | | | | | | | | | | | | | | | | There is a performance problem: when all sbi->fs_lock are holded, then all the following threads may get the same next_lock value from sbi->next_lock_num in function mutex_lock_op, and wait for the same lock(fs_lock[next_lock]), it may cause performance reduce. So we move the sbi->next_lock_num++ before getting lock, this will average the following threads if all sbi->fs_lock are holded. v1-->v2: Drop the needless spin_lock as Jaegeuk suggested. Suggested-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Yu Chao <chao2.yu@samsung.com> Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* Merge branch 'for-3.12/core' of git://git.kernel.dk/linux-blockLinus Torvalds2013-09-231-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block IO fixes from Jens Axboe: "After merge window, no new stuff this time only a collection of neatly confined and simple fixes" * 'for-3.12/core' of git://git.kernel.dk/linux-block: cfq: explicitly use 64bit divide operation for 64bit arguments block: Add nr_bios to block_rq_remap tracepoint If the queue is dying then we only call the rq->end_io callout. This leaves bios setup on the request, because the caller assumes when the blk_execute_rq_nowait/blk_execute_rq call has completed that the rq->bios have been cleaned up. bio-integrity: Fix use of bs->bio_integrity_pool after free blkcg: relocate root_blkg setting and clearing block: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) block: trace all devices plug operation
| * bio-integrity: Fix use of bs->bio_integrity_pool after freeBjorn Helgaas2013-09-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | This fixes a copy and paste error introduced by 9f060e2231 ("block: Convert integrity to bvec_alloc_bs()"). Found by Coverity (CID 1020654). Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Kent Overstreet <koverstreet@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge branch 'for-linus' of ↵Linus Torvalds2013-09-2220-175/+363
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "These are mostly bug fixes and a two small performance fixes. The most important of the bunch are Josef's fix for a snapshotting regression and Mark's update to fix compile problems on arm" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (25 commits) Btrfs: create the uuid tree on remount rw btrfs: change extent-same to copy entire argument struct Btrfs: dir_inode_operations should use btrfs_update_time also btrfs: Add btrfs: prefix to kernel log output btrfs: refuse to remount read-write after abort Btrfs: btrfs_ioctl_default_subvol: Revert back to toplevel subvolume when arg is 0 Btrfs: don't leak transaction in btrfs_sync_file() Btrfs: add the missing mutex unlock in write_all_supers() Btrfs: iput inode on allocation failure Btrfs: remove space_info->reservation_progress Btrfs: kill delay_iput arg to the wait_ordered functions Btrfs: fix worst case calculator for space usage Revert "Btrfs: rework the overcommit logic to be based on the total size" Btrfs: improve replacing nocow extents Btrfs: drop dir i_size when adding new names on replay Btrfs: replay dir_index items before other items Btrfs: check roots last log commit when checking if an inode has been logged Btrfs: actually log directory we are fsync()'ing Btrfs: actually limit the size of delalloc range Btrfs: allocate the free space by the existed max extent size when ENOSPC ...
| * | Btrfs: create the uuid tree on remount rwJosef Bacik2013-09-211-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Users have been complaining of the uuid tree stuff warning that there is no uuid root when trying to do snapshot operations. This is because if you mount -o ro we will not create the uuid tree. But then if you mount -o rw,remount we will still not create it and then any subsequent snapshot/subvol operations you try to do will fail gloriously. Fix this by creating the uuid_root on remount rw if it was not already there. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | btrfs: change extent-same to copy entire argument structMark Fasheh2013-09-211-31/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs_ioctl_file_extent_same() uses __put_user_unaligned() to copy some data back to it's argument struct. Unfortunately, not all architectures provide __put_user_unaligned(), so compiles break on them if btrfs is selected. Instead, just copy the whole struct in / out at the start and end of operations, respectively. Signed-off-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | Btrfs: dir_inode_operations should use btrfs_update_time alsoGuangyu Sun2013-09-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 2bc5565286121d2a77ccd728eb3484dff2035b58 (Btrfs: don't update atime on RO subvolumes) ensures that the access time of an inode is not updated when the inode lives in a read-only subvolume. However, if a directory on a read-only subvolume is accessed, the atime is updated. This results in a write operation to a read-only subvolume. I believe that access times should never be updated on read-only subvolumes. To reproduce: # mkfs.btrfs -f /dev/dm-3 (...) # mount /dev/dm-3 /mnt # btrfs subvol create /mnt/sub Create subvolume '/mnt/sub' # mkdir /mnt/sub/dir # echo "abc" > /mnt/sub/dir/file # btrfs subvol snapshot -r /mnt/sub /mnt/rosnap Create a readonly snapshot of '/mnt/sub' in '/mnt/rosnap' # stat /mnt/rosnap/dir File: `/mnt/rosnap/dir' Size: 8 Blocks: 0 IO Block: 4096 directory Device: 16h/22d Inode: 257 Links: 1 Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2013-09-11 07:21:49.389157126 -0400 Modify: 2013-09-11 07:22:02.330156079 -0400 Change: 2013-09-11 07:22:02.330156079 -0400 # ls /mnt/rosnap/dir file # stat /mnt/rosnap/dir File: `/mnt/rosnap/dir' Size: 8 Blocks: 0 IO Block: 4096 directory Device: 16h/22d Inode: 257 Links: 1 Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2013-09-11 07:22:56.797151670 -0400 Modify: 2013-09-11 07:22:02.330156079 -0400 Change: 2013-09-11 07:22:02.330156079 -0400 Reported-by: Koen De Wit <koen.de.wit@oracle.com> Signed-off-by: Guangyu Sun <guangyu.sun@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | btrfs: Add btrfs: prefix to kernel log outputFrank Holton2013-09-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel log entries for device label %s and device fsid %pU are missing the btrfs: prefix. Add those here. Signed-off-by: Frank Holton <fholton@gmail.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | btrfs: refuse to remount read-write after abortDavid Sterba2013-09-211-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's still possible to flip the filesystem into RW mode after it's remounted RO due to an abort. There are lots of places that check for the superblock error bit and will not write data, but we should not let the filesystem appear read-write. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | Btrfs: btrfs_ioctl_default_subvol: Revert back to toplevel subvolume when ↵chandan2013-09-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | arg is 0 This patch makes it possible to set BTRFS_FS_TREE_OBJECTID as the default subvolume by passing a subvolume id of 0. Signed-off-by: chandan <chandan@linux.vnet.ibm.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | Btrfs: don't leak transaction in btrfs_sync_file()Filipe David Borba Manana2013-09-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | In btrfs_sync_file(), if the call to btrfs_log_dentry_safe() returns a negative error (for e.g. -ENOMEM via btrfs_log_inode()), we would return without ending/freeing the transaction. Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | Btrfs: add the missing mutex unlock in write_all_supers()Stefan Behrens2013-09-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BUG() was replaced by btrfs_error() and return -EIO with the patch "get rid of one BUG() in write_all_supers()", but the missing mutex_unlock() was overlooked. The 0-DAY kernel build service from Intel reported the missing unlock which was found by the coccinelle tool: fs/btrfs/disk-io.c:3422:2-8: preceding lock on line 3374 Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | Btrfs: iput inode on allocation failureJosef Bacik2013-09-211-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | We don't do the iput when we fail to allocate our delayed delalloc work in __start_delalloc_inodes, fix this. Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>