path: root/fs/btrfs/extent_io.h
Commit message [Author, Age, Files, Lines]
...
| * btrfs: sink gfp parameter to set_record_extent_bits [David Sterba, 2016-04-29, 1 file, -2/+1]
| |     Single caller passes GFP_NOFS.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: sink gfp parameter to set_extent_new [David Sterba, 2016-04-29, 1 file, -2/+3]
| |     Single caller passes GFP_NOFS.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: sink gfp parameter to set_extent_defrag [David Sterba, 2016-04-29, 1 file, -2/+2]
| |     Single caller passes GFP_NOFS.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: sink gfp parameter to set_extent_delalloc [David Sterba, 2016-04-29, 1 file, -2/+2]
| |     Callers pass GFP_NOFS and tests pass GFP_KERNEL, but using NOFS there does not hurt. No need to pass the flags around.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: sink gfp parameter to clear_extent_dirty [David Sterba, 2016-04-29, 1 file, -2/+2]
| |     Callers pass GFP_NOFS. No need to pass the flags around.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: sink gfp parameter to clear_record_extent_bits [David Sterba, 2016-04-29, 1 file, -2/+1]
| |     Callers pass GFP_NOFS. No need to pass the flags around.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: sink gfp parameter to clear_extent_bits [David Sterba, 2016-04-29, 1 file, -2/+3]
| |     Callers pass GFP_NOFS and GFP_KERNEL. No need to pass the flags around.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
| * btrfs: sink gfp parameter to set_extent_bits [David Sterba, 2016-04-29, 1 file, -2/+2]
| |     All callers pass GFP_NOFS.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
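The "sink gfp parameter" commits above share one pattern: when every caller passes the same allocation flag, the parameter is dropped from the helper and the constant is supplied inside it. A minimal sketch of that pattern in plain C follows. The names `set_bits_generic`, `set_bits`, `last_mask_seen`, and the `MY_GFP_*` constants are hypothetical stand-ins for illustration, not the btrfs or kernel definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for allocation flags; not the kernel's values. */
enum my_gfp { MY_GFP_KERNEL = 1, MY_GFP_NOFS = 2 };

/* Remembers the mask the generic routine last received (illustration only). */
static enum my_gfp last_mask_seen;

/* Generic routine that still takes an explicit allocation mask. */
static int set_bits_generic(uint64_t start, uint64_t end, unsigned bits,
                            enum my_gfp mask)
{
    (void)bits;              /* a real implementation would update a tree */
    if (end < start)
        return -1;           /* reject an inverted range */
    last_mask_seen = mask;
    return 0;
}

/* After sinking: callers no longer pass a mask, the helper supplies it. */
static inline int set_bits(uint64_t start, uint64_t end, unsigned bits)
{
    return set_bits_generic(start, end, bits, MY_GFP_NOFS);
}
```

With this shape, a call site shrinks from four arguments to three, and the choice of flag lives in one place.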
* | btrfs: kill unused writepage_io_hook callback [David Sterba, 2016-05-06, 1 file, -1/+0]
|/
|     It seems to have been unused for a long time, since 2008 and 6885f308b5570 ("Btrfs: Misc 2.6.25 updates"). Propagating the removal touches some code but has no functional effect.
|     Signed-off-by: David Sterba <dsterba@suse.com>
* mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros [Kirill A. Shutemov, 2016-04-04, 1 file, -3/+3]
|     PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time ago with the promise that one day it would be possible to implement the page cache with bigger chunks than PAGE_SIZE.
|
|     This promise never materialized, and it is unlikely it ever will.
|
|     We have many places where PAGE_CACHE_SIZE is assumed to be equal to PAGE_SIZE. And it's a constant source of confusion whether a PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm.
|
|     Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much breakage to be doable.
|
|     Let's stop pretending that pages in page cache are special. They are not.
|
|     The changes are pretty straight-forward:
|
|      - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
|      - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
|      - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
|      - page_cache_get() -> get_page();
|      - page_cache_release() -> put_page();
|
|     This patch contains automated changes generated with coccinelle using the script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually.
|
|     The only adjustment after coccinelle is a revert of the changes to the PAGE_CACHE_ALIGN definition: we are going to drop it later.
|
|     There are a few places in the code that coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation will also be addressed in a separate patch.
|
|     virtual patch
|
|     @@
|     expression E;
|     @@
|     - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
|     + E
|
|     @@
|     expression E;
|     @@
|     - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
|     + E
|
|     @@
|     @@
|     - PAGE_CACHE_SHIFT
|     + PAGE_SHIFT
|
|     @@
|     @@
|     - PAGE_CACHE_SIZE
|     + PAGE_SIZE
|
|     @@
|     @@
|     - PAGE_CACHE_MASK
|     + PAGE_MASK
|
|     @@
|     expression E;
|     @@
|     - PAGE_CACHE_ALIGN(E)
|     + PAGE_ALIGN(E)
|
|     @@
|     expression E;
|     @@
|     - page_cache_get(E)
|     + get_page(E)
|
|     @@
|     expression E;
|     @@
|     - page_cache_release(E)
|     + put_page(E)
|
|     Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
|     Acked-by: Michal Hocko <mhocko@suse.com>
|     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'cleanups-4.6' into for-chris-4.6 [David Sterba, 2016-02-26, 1 file, -3/+2]
|\
| * btrfs: use proper type for failrec in extent_state [David Sterba, 2016-02-18, 1 file, -3/+2]
| |     We use the private member of extent_state to store the failrec and play pointless pointer games.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
* | Btrfs: remove no longer used function extent_read_full_page_nolock() [Filipe Manana, 2016-02-03, 1 file, -3/+0]
|/
|     Not needed after the previous patch named "Btrfs: fix page reading in extent_same ioctl leading to csum errors".
|     Signed-off-by: Filipe Manana <fdmanana@suse.com>
* Merge branch 'freespace-4.5' into for-linus-4.5 [Chris Mason, 2015-12-23, 1 file, -1/+9]
|\
| * Merge branch 'freespace-tree' into for-linus-4.5 [Chris Mason, 2015-12-18, 1 file, -1/+9]
| |\
| | |     Signed-off-by: Chris Mason <clm@fb.com>
| | * Btrfs: add extent buffer bitmap sanity tests [Omar Sandoval, 2015-12-17, 1 file, -1/+3]
| | |     Sanity test the extent buffer bitmap operations (test, set, and clear) against the equivalent standard kernel operations.
| | |     Signed-off-by: Omar Sandoval <osandov@fb.com>
| | |     Signed-off-by: Chris Mason <clm@fb.com>
| | * Btrfs: add extent buffer bitmap operations [Omar Sandoval, 2015-12-17, 1 file, -0/+6]
| | |     These are going to be used for the free space tree bitmap items.
| | |     Signed-off-by: Omar Sandoval <osandov@fb.com>
| | |     Signed-off-by: Chris Mason <clm@fb.com>
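The three bitmap operations named above (test, set, and clear) can be illustrated on an ordinary byte array. This is only a sketch of the general technique: the function names `buf_test_bit`, `buf_set_range`, and `buf_clear_range` are hypothetical, and the real btrfs helpers operate on an extent buffer whose data spans pages, which this example does not model.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if bit `nr` is set in the byte-addressed bitmap, else 0. */
static inline int buf_test_bit(const uint8_t *map, size_t nr)
{
    return (map[nr / CHAR_BIT] >> (nr % CHAR_BIT)) & 1;
}

/* Set `len` consecutive bits starting at bit `start`. */
static inline void buf_set_range(uint8_t *map, size_t start, size_t len)
{
    for (size_t i = start; i < start + len; i++)
        map[i / CHAR_BIT] |= (uint8_t)(1u << (i % CHAR_BIT));
}

/* Clear `len` consecutive bits starting at bit `start`. */
static inline void buf_clear_range(uint8_t *map, size_t start, size_t len)
{
    for (size_t i = start; i < start + len; i++)
        map[i / CHAR_BIT] &= (uint8_t)~(1u << (i % CHAR_BIT));
}
```

A sanity test in the spirit of the commit above would run the same sequence of operations on a reference bitmap and compare the two byte for byte.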
* | | Merge branch 'dev/simplify-set-bit' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 [Chris Mason, 2015-12-23, 1 file, -23/+91]
|\ \ \
| | | |     Signed-off-by: Chris Mason <clm@fb.com>
| * | | btrfs: make lock_extent static inline [David Sterba, 2015-12-03, 1 file, -1/+6]
| | | |     One call less reduces stack usage, code slightly reduced as well.
| | | |     Signed-off-by: David Sterba <dsterba@suse.com>
| * | | btrfs: drop unused parameter from lock_extent_bits [David Sterba, 2015-12-03, 1 file, -1/+1]
| | | |     We've always passed 0. Stack usage will slightly decrease.
| | | |     Signed-off-by: David Sterba <dsterba@suse.com>
| * | | btrfs: make clear_extent_bit helpers static inline [David Sterba, 2015-12-03, 1 file, -9/+38]
| | | |     The functions just wrap the clear_extent_bit API and generate function calls. This increases stack consumption and may negatively affect performance due to icache misses. We can simply make the helpers static inline and keep the type checking and API untouched. The code slightly increases:
| | | |
| | | |        text    data     bss     dec     hex filename
| | | |      938667   43670   23144 1005481   f57a9 fs/btrfs/btrfs.ko.before
| | | |      939651   43670   23144 1006465   f5b81 fs/btrfs/btrfs.ko.after
| | | |
| | | |     Signed-off-by: David Sterba <dsterba@suse.com>
| * | | btrfs: make set_extent_bit helpers static inline [David Sterba, 2015-12-03, 1 file, -12/+46]
| |/ /
| | |     The functions just wrap the set_extent_bit API and generate function calls. This increases stack consumption and may negatively affect performance due to icache misses. We can simply make the helpers static inline and keep the type checking and API untouched. The code slightly increases:
| | |
| | |        text    data     bss     dec     hex filename
| | |      938427   43670   23144 1005241   f56b9 fs/btrfs/btrfs.ko.before
| | |      938667   43670   23144 1005481   f57a9 fs/btrfs/btrfs.ko
| | |
| | |     Signed-off-by: David Sterba <dsterba@suse.com>
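The idea behind the two "static inline" commits can be sketched as follows: one general function stays out of line, and each thin wrapper becomes a `static inline` function in the header, so the compiler keeps checking argument types while emitting no extra call. The names `range_state`, `set_range_bits`, `set_range_dirty`, and `BIT_DIRTY` are hypothetical and only illustrate the shape; they are not the btrfs definitions.

```c
#include <stdint.h>

#define BIT_DIRTY (1u << 0)   /* hypothetical state bit */

/* Hypothetical stand-in for the state the general API updates. */
struct range_state {
    uint64_t start;
    uint64_t end;
    unsigned bits;
};

/* The general, out-of-line API: sets `bits` on the range [start, end]. */
int set_range_bits(struct range_state *st, uint64_t start, uint64_t end,
                   unsigned bits)
{
    if (end < start)
        return -1;            /* reject an inverted range */
    st->start = start;
    st->end = end;
    st->bits |= bits;
    return 0;
}

/* Thin helper as static inline: same signature checking, no call frame. */
static inline int set_range_dirty(struct range_state *st, uint64_t start,
                                  uint64_t end)
{
    return set_range_bits(st, start, end, BIT_DIRTY);
}
```

The trade-off noted in the commit messages applies here too: inlining removes a call per helper use but can change the total code size in either direction.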
* | | btrfs: make extent_range_redirty_for_io return void [David Sterba, 2015-12-07, 1 file, -1/+1]
| | |     Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from.
| | |     Signed-off-by: David Sterba <dsterba@suse.com>
* | | btrfs: make extent_range_clear_dirty_for_io return void [David Sterba, 2015-12-07, 1 file, -1/+1]
| | |     Does not return any errors, nor anything from the callgraph. There's a BUG_ON but it's a sanity check and not an error condition we could recover from.
| | |     Signed-off-by: David Sterba <dsterba@suse.com>
* | | btrfs: make end_extent_writepage return void [David Sterba, 2015-12-07, 1 file, -1/+1]
| | |     Does not return any errors, nor anything from the callgraph. The branch in end_bio_extent_writepage has been skipped since 5fd02043553b ("Btrfs: finish ordered extents in their own thread").
| | |     Signed-off-by: David Sterba <dsterba@suse.com>
* | | btrfs: make extent_clear_unlock_delalloc return void [David Sterba, 2015-12-07, 1 file, -1/+1]
| | |     Does not return any errors, nor anything from the callgraph.
| | |     Signed-off-by: David Sterba <dsterba@suse.com>
* | | btrfs: make clear_extent_buffer_uptodate return void [David Sterba, 2015-12-07, 1 file, -1/+1]
| | |     Does not return any errors, nor anything from the callgraph.
| | |     Signed-off-by: David Sterba <dsterba@suse.com>
* | | btrfs: make set_extent_buffer_uptodate return void [David Sterba, 2015-12-07, 1 file, -1/+1]
|/ /
| |     Does not return any errors, nor anything from the callgraph.
| |     Signed-off-by: David Sterba <dsterba@suse.com>
* | btrfs: qgroup: Introduce btrfs_qgroup_reserve_data function [Qu Wenruo, 2015-10-22, 1 file, -0/+1]
| |     Introduce a new function, btrfs_qgroup_reserve_data(), which will use io_tree to make qgroup reservation accurate and to avoid leaking reserved space.
| |     Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
| |     Signed-off-by: Chris Mason <clm@fb.com>
* | btrfs: extent_io: Introduce new function clear_record_extent_bits() [Qu Wenruo, 2015-10-22, 1 file, -0/+3]
| |     Introduce new function clear_record_extent_bits(), which will clear bits for the given range and record the details about which ranges are cleared and how many bytes in total it changes. This provides the basis for the later qgroup reserve code.
| |     Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
| |     Signed-off-by: Chris Mason <clm@fb.com>
* | btrfs: extent_io: Introduce new function set_record_extent_bits [Qu Wenruo, 2015-10-22, 1 file, -0/+3]
| |     Introduce new function set_record_extent_bits(), which will not only set the given bits, but also record how many bytes are changed, and detailed range info. This is quite important for the later qgroup reserve framework. The number of bytes will be used to do the qgroup reserve, and the detailed range info will be used to clean up for the EQUOT case.
| |     Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
| |     Signed-off-by: Chris Mason <clm@fb.com>
* | btrfs: extent_io: Introduce needed structure for recoding set/clear bits [Qu Wenruo, 2015-10-22, 1 file, -0/+12]
|/
|     Add a new structure, extent_change_set, to record how many bytes are changed in one set/clear_extent_bits() operation, with detailed info on the changed ranges. This provides the needed facilities for the later qgroup reserve framework.
|     Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
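A change set of the kind described above pairs a running byte count with the list of ranges that contributed to it. The sketch below shows that idea with a fixed-size array. It is an assumption-laden illustration: `change_set`, `byte_range`, `change_set_record`, and `MAX_RANGES` are hypothetical names, and the layout is not claimed to match the real extent_change_set.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16   /* arbitrary cap chosen for this sketch */

struct byte_range {
    uint64_t start;
    uint64_t len;
};

/* Hypothetical change set: total bytes changed plus the ranges involved. */
struct change_set {
    uint64_t bytes_changed;
    size_t nr_ranges;
    struct byte_range ranges[MAX_RANGES];
};

/*
 * Record that the inclusive range [start, end] changed.
 * Returns 0 on success, -1 if the range is inverted or the set is full.
 */
int change_set_record(struct change_set *cs, uint64_t start, uint64_t end)
{
    if (end < start || cs->nr_ranges == MAX_RANGES)
        return -1;
    cs->ranges[cs->nr_ranges].start = start;
    cs->ranges[cs->nr_ranges].len = end - start + 1;
    cs->nr_ranges++;
    cs->bytes_changed += end - start + 1;
    return 0;
}
```

The byte total is what a reservation would be charged against, and the per-range list is what a caller would walk to undo the change if the reservation fails.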
* btrfs: constify structs with op functions or static definitions [David Sterba, 2015-02-16, 1 file, -1/+1]
|     There are some op tables that can be easily made const, similarly the sysfs feature and raid tables. This is motivated by PaX CONSTIFY plugin.
|     Signed-off-by: David Sterba <dsterba@suse.cz>
* btrfs: switch extent_state state to unsigned [David Sterba, 2015-01-22, 1 file, -29/+29]
|     Currently there's a 4B hole in the structure between refs and state and there are only 16 bits used so we can make it unsigned. This will get a better packing and may save some stack space for local variables. The size of extent_state gets reduced by 8B and there are usually a lot of slab objects.
|
|     struct extent_state {
|             u64                start;        /*     0     8 */
|             u64                end;          /*     8     8 */
|             struct rb_node     rb_node;      /*    16    24 */
|             wait_queue_head_t  wq;           /*    40    24 */
|             /* --- cacheline 1 boundary (64 bytes) --- */
|             atomic_t           refs;         /*    64     4 */
|
|             /* XXX 4 bytes hole, try to pack */
|
|             long unsigned int  state;        /*    72     8 */
|             u64                private;      /*    80     8 */
|
|             /* size: 88, cachelines: 2, members: 7 */
|             /* sum members: 84, holes: 1, sum holes: 4 */
|             /* last cacheline: 24 bytes */
|     };
|
|     Signed-off-by: David Sterba <dsterba@suse.cz>
|     Signed-off-by: Chris Mason <clm@fb.com>
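The 8-byte saving described above can be reproduced in user space. The sketch below mirrors the member sizes from the layout dump using fixed-width types; `state_before` and `state_after` are hypothetical names, the two 24-byte arrays merely stand in for `struct rb_node` and `wait_queue_head_t`, and the offsets in the comments assume an ABI where `uint64_t` is 8-byte aligned (such as x86_64).

```c
#include <stddef.h>
#include <stdint.h>

/* Layout with an 8-byte state field: a 4-byte hole follows refs. */
struct state_before {
    uint64_t start;        /* offset  0 */
    uint64_t end;          /* offset  8 */
    uint64_t rb_node[3];   /* offset 16, 24-byte stand-in */
    uint64_t wq[3];        /* offset 40, 24-byte stand-in */
    int32_t  refs;         /* offset 64 */
                           /* 4-byte hole: state needs 8-byte alignment */
    uint64_t state;        /* offset 72 */
    uint64_t priv;         /* offset 80 */
};                         /* 88 bytes under the stated assumption */

/* Same members, but state is 4 bytes wide and fills the hole. */
struct state_after {
    uint64_t start;        /* offset  0 */
    uint64_t end;          /* offset  8 */
    uint64_t rb_node[3];   /* offset 16 */
    uint64_t wq[3];        /* offset 40 */
    int32_t  refs;         /* offset 64 */
    uint32_t state;        /* offset 68 */
    uint64_t priv;         /* offset 72 */
};                         /* 80 bytes under the stated assumption */
```

Comparing `sizeof` and `offsetof` on the two structs shows where the hole was and that narrowing one member removes it without reordering anything.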
* btrfs: sink parameter len to alloc_extent_buffer [David Sterba, 2014-12-12, 1 file, -2/+2]
|     Because we're using globally known nodesize. Do the same for the sanity test function variant.
|     Signed-off-by: David Sterba <dsterba@suse.cz>
* btrfs: unify extent buffer allocation api [David Sterba, 2014-12-12, 1 file, -1/+2]
|     Make the extent buffer allocation interface consistent. Cloned eb will set a valid fs_info. For dummy eb, we can drop the length parameter and set it from fs_info. The built-in sanity checks may pass a NULL fs_info that's queried for nodesize, but we know it's 4096.
|     Signed-off-by: David Sterba <dsterba@suse.cz>
* Btrfs: set page and mapping error on compressed write failure [Filipe Manana, 2014-11-21, 1 file, -0/+1]
|     If we fail in submit_compressed_extents() before calling btrfs_submit_compressed_write(), we start and end the writeback for the pages (clear their dirty flag, unlock them, etc) but we don't tag the pages, nor the inode's mapping, with an error. This makes it impossible for a caller of filemap_fdatawait_range() (fsync, or transaction commit, for example) to know that there was an error.
|
|     Note that the return value of submit_compressed_extents() is useless, as that function is executed by a workqueue task and not directly by the fill_delalloc callback. This means the writepage/s callbacks of the inode's address space operations don't get that return value.
|
|     Signed-off-by: Filipe Manana <fdmanana@suse.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: fix compiles when CONFIG_BTRFS_FS_RUN_SANITY_TESTS is off [Chris Mason, 2014-10-07, 1 file, -1/+1]
|     Commit fccb84c94 added some helpers to clean up our sanity tests, but it looks like both Dave and I always compile with the tests enabled. This fixes things to work when they are turned off too.
|     Signed-off-by: Chris Mason <clm@fb.com>
* Merge branch 'cleanup/misc-for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus [Chris Mason, 2014-10-04, 1 file, -10/+0]
|\
| |     Signed-off-by: Chris Mason <clm@fb.com>
| |     Conflicts: fs/btrfs/extent_io.c
| * btrfs: kill extent_buffer_page helper [David Sterba, 2014-10-02, 1 file, -6/+0]
| |     It used to be more complex but now it's just a simple array access.
| |     Signed-off-by: David Sterba <dsterba@suse.cz>
| * btrfs: remove unused extent state bits [David Sterba, 2014-10-02, 1 file, -4/+0]
| |     The last users are long gone.
| |     Signed-off-by: David Sterba <dsterba@suse.cz>
* | Btrfs: be aware of btree inode write errors to avoid fs corruption [Filipe Manana, 2014-10-04, 1 file, -2/+5]
|/
|     While we have a transaction ongoing, the VM might decide at any time to call btree_inode->i_mapping->a_ops->writepages(), which will start writeback of dirty pages belonging to btree nodes/leafs. This call might return an error or the writeback might finish with an error before we attempt to commit the running transaction. If this happens, we might have no way of knowing that such error happened when we are committing the transaction - because the pages might no longer be marked dirty nor tagged for writeback (if a subsequent modification to the extent buffer didn't happen before the transaction commit) which makes filemap_fdata[write|wait]_range unable to find such pages (even if they're marked with SetPageError).
|
|     So if this happens we must abort the transaction, otherwise we commit a super block with btree roots that point to btree nodes/leafs whose content on disk is invalid - either garbage or the content of some node/leaf from a past generation that got cowed or deleted and is no longer valid (for this latter case we end up getting error messages like "parent transid verify failed on 10826481664 wanted 25748 found 29562" when reading btree nodes/leafs from disk).
|
|     Note that setting and checking AS_EIO/AS_ENOSPC in the btree inode's i_mapping would not be enough because we need to distinguish between log tree extents (not fatal) vs non-log tree extents (fatal) and because the next call to filemap_fdatawait_range() will catch and clear such errors in the mapping - and that call might be from a log sync and not from a transaction commit, which means we would not know about the error at transaction commit time.
|
|     Also, checking for the eb flag EXTENT_BUFFER_IOERR at transaction commit time isn't done and would not be completely reliable, as the eb might be removed from memory and read back when trying to get it, which clears that flag right before reading the eb's pages from disk, making us not know about the previous write error.
|
|     Using the new 3 flags for the btree inode also makes us achieve the goal of AS_EIO/AS_ENOSPC when writepages() returns success, started writeback for all dirty pages and before filemap_fdatawait_range() is called, the writeback for all dirty pages had already finished with errors - because we were not using AS_EIO/AS_ENOSPC, filemap_fdatawait_range() would return success, as it could not know that writeback errors happened (the pages were no longer tagged for writeback).
|
|     Signed-off-by: Filipe Manana <fdmanana@suse.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: cleanup the read failure record after write or when the inode is freeing [Miao Xie, 2014-09-17, 1 file, -0/+1]
|     After the data is written successfully, we should clean up the read failure record in that range because
|      - If we set data COW for the file, the range that the failure record pointed to is mapped to a new place, so it is invalid.
|      - If we set no data COW for the file, and if there is no error during writing, the corrupted data is corrected, so the failure record can be removed. And if some errors happen on the mirrors, we also needn't worry about it because the failure record will be recreated if we read the same place again.
|
|     Sometimes, we may fail to correct the data, so the failure records will be left in the tree. We need to free them when we free the inode, or a memory leak happens.
|
|     Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: implement repair function when direct read fails [Miao Xie, 2014-09-17, 1 file, -1/+4]
|     This patch implements the data repair function for when a direct read fails.
|
|     The details of the implementation are:
|      - When we find the data is not right, we try to read the data from the other mirror.
|      - When the io on the mirror ends, we will insert the endio work into the dedicated btrfs workqueue, not the common read endio workqueue, because the original endio work is still blocked in the btrfs endio workqueue. If we inserted the endio work of the io on the mirror into that workqueue, a deadlock would happen.
|      - After we get the right data, we write it back to the corrupted mirror.
|      - And if the data on the new mirror is still corrupted, we will try the next mirror until we read the right data or all the mirrors are traversed.
|      - After the above work, we set the uptodate flag according to the result.
|
|     Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
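The mirror traversal described in the list above (try the next mirror until good data is read or every mirror has been tried) might be sketched as a simple loop. Everything here is a hypothetical illustration: `find_good_mirror`, `read_mirror_fn`, and `mirror_in_good_mask` are invented names, mirrors are numbered from 0 for simplicity, and the workqueue handling and write-back steps from the commit message are not modelled.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true if the copy on `mirror` is good. */
typedef bool (*read_mirror_fn)(int mirror, void *ctx);

/*
 * Try every mirror other than the one that failed, in round-robin order.
 * Returns the first mirror with good data, or -1 if all were tried.
 */
int find_good_mirror(int failed_mirror, int num_mirrors,
                     read_mirror_fn read_mirror, void *ctx)
{
    for (int i = 1; i < num_mirrors; i++) {
        int mirror = (failed_mirror + i) % num_mirrors;

        if (read_mirror(mirror, ctx))
            return mirror;
    }
    return -1;
}

/* Example callback: ctx points to a bitmask of mirrors holding good data. */
bool mirror_in_good_mask(int mirror, void *ctx)
{
    const unsigned *good = ctx;

    return (*good >> mirror) & 1u;
}
```

A caller would use the returned mirror as the source for rewriting the corrupted copy, and treat -1 as an unrecoverable read error.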
* Btrfs: modify clean_io_failure and make it suit direct io [Miao Xie, 2014-09-17, 1 file, -3/+3]
|     We could not use clean_io_failure in the direct IO path because it got the filesystem information from the page structure, but the page in the direct IO bio didn't have the filesystem information in its structure. So we need to modify it and pass all the information it needs by parameters.
|     Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: modify repair_io_failure and make it suit direct io [Miao Xie, 2014-09-17, 1 file, -1/+1]
|     The original code of repair_io_failure was only used for buffered reads, because it got some filesystem data from the page structure, which is safe for a page in the page cache. But when we do a direct read, the pages in the bio are not in the page cache, that is, there is no filesystem data in the page structure. In order to implement direct read data repair, we need to modify repair_io_failure and pass all the filesystem data it needs by function parameters.
|     Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: split bio_readpage_error into several functions [Miao Xie, 2014-09-17, 1 file, -0/+28]
|     The data repair function of direct read will be implemented later, and some code in bio_readpage_error will be reused, so split bio_readpage_error into several functions which will be used in direct read repair later.
|     Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: shrink further sizeof(struct extent_buffer) [Filipe Manana, 2014-09-17, 1 file, -2/+0]
|     The map_start and map_len fields aren't used anywhere, so just remove them. On a x86_64 system, this reduced sizeof(struct extent_buffer) from 296 bytes to 280 bytes, and therefore 14 extent_buffer structs can now fit into a page instead of 13.
|     Signed-off-by: Filipe Manana <fdmanana@suse.com>
|     Reviewed-by: David Sterba <dsterba@suse.cz>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: reduce size of struct extent_state [Filipe Manana, 2014-09-17, 1 file, -1/+0]
|     The tree field of struct extent_state was only used to figure out if an extent state was connected to an inode's io tree or not. For this we can just use the rb_node field itself. On a x86_64 system with this change the sizeof(struct extent_state) is reduced from 96 bytes down to 88 bytes, meaning that with a page size of 4096 bytes we can now store 46 extent states per page instead of 42.
|     Signed-off-by: Filipe Manana <fdmanana@suse.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
* Btrfs: remove unused wait queue in struct extent_buffer [Filipe Manana, 2014-06-19, 1 file, -1/+0]
|     The lock_wq wait queue is not used anywhere, therefore just remove it. On a x86_64 system, this reduced sizeof(struct extent_buffer) from 320 bytes down to 296 bytes, which means a 4Kb page can now be used for 13 extent buffers instead of 12.
|     Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
|     Signed-off-by: Chris Mason <clm@fb.com>
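The objects-per-page figures quoted in the three size-reduction commits above follow from integer division of the page size by the structure size. A minimal sketch, assuming a 4096-byte page and ignoring any per-slab bookkeeping overhead (the helper name `objs_per_page` is hypothetical):

```c
#include <stddef.h>

/* Number of whole objects of `obj_size` bytes that fit in one page. */
static inline size_t objs_per_page(size_t page_size, size_t obj_size)
{
    return obj_size ? page_size / obj_size : 0;
}
```

For struct extent_buffer this gives 4096 / 320 = 12, 4096 / 296 = 13, and 4096 / 280 = 14; for struct extent_state it gives 4096 / 96 = 42 and 4096 / 88 = 46, matching the numbers in the commit messages.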