summaryrefslogtreecommitdiffstats
path: root/drivers/md/bcache/bcache.h
Commit message (Collapse)AuthorAgeFilesLines
...
* bcache: Debug code improvementsKent Overstreet2013-11-111-9/+1Star
| | | | | | | | | | | | | | | Couple changes: * Consolidate bch_check_keys() and bch_check_key_order(), and move the checks that only check_key_order() could do to bch_btree_iter_next(). * Get rid of CONFIG_BCACHE_EDEBUG - now, all that code is compiled in when CONFIG_BCACHE_DEBUG is enabled, and there's now a sysfs file to flip on the EDEBUG checks at runtime. * Dropped an old not terribly useful check in rw_unlock(), and refactored/improved a some of the other debug code. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Pull on disk data structures out into a separate headerKent Overstreet2013-11-111-242/+2Star
| | | | | | | Now, the on disk data structures are in a header that can be exported to userspace - and having them all centralized is nice too. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Move sector allocator to alloc.cKent Overstreet2013-11-111-0/+4
| | | | | | Just reorganizing things a bit. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Add btree_map() functionsKent Overstreet2013-11-111-2/+0Star
| | | | | | | | | | | | | | Lots of stuff has been open coding its own btree traversal - which is generally pretty simple code, but there are a few subtleties. This adds new new functions, bch_btree_map_nodes() and bch_btree_map_keys(), which do the traversal for you. Everything that's open coding btree traversal now (with the exception of garbage collection) is slowly going to be converted to these two functions; being able to write other code at a higher level of abstraction is a big improvement w.r.t. overall code quality. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Convert writeback to a kthreadKent Overstreet2013-11-111-4/+6
| | | | | | | | This simplifies the writeback flow control quite a bit - previously, it was conceptually two coroutines, refill_dirty() and read_dirty(). This makes the code quite a bit more straightforward. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Convert gc to a kthreadKent Overstreet2013-11-111-5/+4Star
| | | | | | | | | We needed a dedicated rescuer workqueue for gc anyways... and gc was conceptually a dedicated thread, just one that wasn't running all the time. Switch it to a dedicated thread to make the code a bit more straightforward. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Convert bucket_wait to wait_queue_head_tKent Overstreet2013-11-111-4/+4
| | | | | | | | At one point we did do fancy asynchronous waiting stuff with bucket_wait, but that's all gone (and bucket_wait is used a lot less than it used to be). So use the standard primitives. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Convert try_wait to wait_queue_head_tKent Overstreet2013-11-111-2/+2
| | | | | | | We never waited on c->try_wait asynchronously, so just use the standard primitives. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Convert btree_insert_check_key() to btree_insert_node()Kent Overstreet2013-11-111-8/+0Star
| | | | | | | | This was the main point of all this refactoring - now, btree_insert_check_key() won't fail just because the leaf node happened to be full. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Stripe size isn't necessarily a power of twoKent Overstreet2013-11-111-1/+1
| | | | | | | | Originally I got this right... except that the divides didn't use do_div(), which broke 32 bit kernels. When I went to fix that, I forgot that the raid stripe size usually isn't a power of two... doh Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Add on error panic/unregister settingKent Overstreet2013-11-111-0/+6
| | | | | | | Works kind of like the ext4 setting, to panic or remount read only on errors. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Use blkdev_issue_discard()Kent Overstreet2013-11-111-10/+0Star
| | | | | | | | | The old asynchronous discard code was really a relic from when all the allocation code was asynchronous - now that allocation runs out of a dedicated thread there's no point in keeping around all that complicated machinery. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Fix a writeback performance regressionKent Overstreet2013-09-241-4/+3Star
| | | | | | | | | | | | | | | | | Background writeback works by scanning the btree for dirty data and adding those keys into a fixed size buffer, then for each dirty key in the keybuf writing it to the backing device. When read_dirty() finishes and it's time to scan for more dirty data, we need to wait for the outstanding writeback IO to finish - they still take up slots in the keybuf (so that foreground writes can check for them to avoid races) - without that wait, we'll continually rescan when we'll be able to add at most a key or two to the keybuf, and that takes locks that starves foreground IO. Doh. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* bcache: Allocation kthread fixesKent Overstreet2013-07-121-4/+0Star
| | | | | | | | The alloc kthread should've been using try_to_freeze() - and also there was the potential for the alloc kthread to get woken up after it had shut down, which would have been bad. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
* bcache: Fix a sysfs splat on shutdownKent Overstreet2013-07-121-0/+1
| | | | | | | | | | | | If we stopped a bcache device when we were already detaching (or something like that), bcache_device_unlink() would try to remove a symlink from sysfs that was already gone because the bcache dev kobject had already been removed from sysfs. So keep track of whether we've removed stuff from sysfs. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
* bcache: Write out full stripesKent Overstreet2013-06-271-2/+1Star
| | | | | | | | | | | | | Now that we're tracking dirty data per stripe, we can add two optimizations for raid5/6: * If a stripe is already dirty, force writes to that stripe to writeback mode - to help build up full stripes of dirty data * When flushing dirty data, preferentially write out full stripes first if there are any. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Track dirty data by stripeKent Overstreet2013-06-271-6/+4Star
| | | | | | | | To make background writeback aware of raid5/6 stripes, we first need to track the amount of dirty data within each stripe - we do this by breaking up the existing sectors_dirty into per stripe atomic_ts Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Initialize sectors_dirty when attachingKent Overstreet2013-06-271-1/+1
| | | | | | | | | | | Previously, dirty_data wouldn't get initialized until the first garbage collection... which was a bit of a problem for background writeback (as the PD controller keys off of it) and also confusing for users. This is also prep work for making background writeback aware of raid5/6 stripes. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Improve lazy sortingKent Overstreet2013-06-271-0/+1
| | | | | | | | | The old lazy sorting code was kind of hacky - rewrite in a way that mathematically makes more sense; the idea is that the size of the sets of keys in a btree node should increase by a more or less fixed ratio from smallest to biggest. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Fix/revamp tracepointsKent Overstreet2013-06-271-20/+0Star
| | | | | | | | | | | | | | | | The tracepoints were reworked to be more sensible, and fixed a null pointer deref in one of the tracepoints. Converted some of the pr_debug()s to tracepoints - this is partly a performance optimization; it used to be that with DEBUG or CONFIG_DYNAMIC_DEBUG pr_debug() was an empty macro; but at some point it was changed to an empty inline function. Some of the pr_debug() statements had rather expensive function calls as part of the arguments, so this code was getting run unnecessarily even on non debug kernels - in some fast paths, too. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Refactor btree ioKent Overstreet2013-06-271-3/+2Star
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The most significant change is that btree reads are now done synchronously, instead of asynchronously and doing the post read stuff from a workqueue. This was originally done because we can't block on IO under generic_make_request(). But - we already have a mechanism to punt cache lookups to workqueue if needed, so if we just use that we don't have to deal with the complexity of doing things asynchronously. The main benefit is this makes the locking situation saner; we can hold our write lock on the btree node until we're finished reading it, and we don't need that btree_node_read_done() flag anymore. Also, for writes, btree_write() was broken out into btree_node_write() and btree_leaf_dirty() - the old code with the boolean argument was dumb and confusing. The prio_blocked mechanism was improved a bit too, now the only counter is in struct btree_write, we don't mess with transfering a count from struct btree anymore. This required changing garbage collection to block prios at the start and unblock when it finishes, which is cleaner than what it was doing anyways (the old code had mostly the same effect, but was doing it in a convoluted way) And the btree iter btree_node_read_done() uses was converted to a real mempool. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Convert allocator thread to kthreadKent Overstreet2013-06-271-6/+11
| | | | | | Using a workqueue when we just want a single thread is a bit silly. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Fix error handling in init codeKent Overstreet2013-05-151-1/+1
| | | | | | | This code appears to have rotted... fix various bugs and do some refactoring. Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Take data offset from the bdev superblock.Kent Overstreet2013-04-211-10/+37
| | | | | | | Add a new superblock version, and consolidate related defines. Signed-off-by: Gabriel de Perthuis <g2p.code+bcache@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
* bcache: Don't export utility code, prefix with bch_Kent Overstreet2013-03-281-1/+1
| | | | | | Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bcache: Style/checkpatch fixesKent Overstreet2013-03-251-5/+5
| | | | | | | | | Took out some nested functions, and fixed some more checkpatch complaints. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
* bcache: A block layer cacheKent Overstreet2013-03-241-0/+1232
Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more. Signed-off-by: Kent Overstreet <koverstreet@google.com>