
History log of /drivers/md/bcache/btree.c
Revision Date Author Comments
2452cc89063a2a6890368f185c4b6d7d8802179e 12-Jul-2014 Slava Pestov <sp@daterainc.com> bcache: try to set b->parent properly

bcache_flash_dev.ktest would reliably crash with 8k and 16k bucket size
before; now it passes.

Change-Id: Ib542232235e39298c3a7548fe52b645cabb823d1
400ffaa2acd72274e2c7293a9724382383bebf3e 13-Jul-2014 Slava Pestov <sp@daterainc.com> bcache: fix use-after-free in btree_gc_coalesce()

If we goto out_nocoalesce after we free new_nodes[0], we end up freeing
new_nodes[0] again. This was generating a lockdep warning. The fix is
to set new_nodes[0] to NULL, since the out_nocoalesce path safely
ignores NULL entries in the new_nodes array.
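
A minimal sketch of the fix and the cleanup path it protects (abbreviated
from the surrounding gc code):

	btree_node_free(new_nodes[0]);
	rw_unlock(true, new_nodes[0]);
	new_nodes[0] = NULL;	/* so the error path can't free it again */

out_nocoalesce:
	for (i = 0; i < nodes; i++)
		if (!IS_ERR_OR_NULL(new_nodes[i])) {
			btree_node_free(new_nodes[i]);
			rw_unlock(true, new_nodes[i]);
		}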

This regression was introduced in 2d7f9531.

Change-Id: I76564d7257800583214376b4bacf236cda90c89c
913dc33fb2720fb5f979011664294137ddd8b13b 23-May-2014 Slava Pestov <sp@daterainc.com> bcache: fix crash in bcache_btree_node_alloc_fail tracepoint

'b' was NULL.

Change-Id: Icac0fd04afa2d23f213d96d51afd53374e6dd0c0
501d52a90cbe652b41336c206ff0e95799d5a9b5 19-May-2014 Kent Overstreet <kmo@daterainc.com> bcache: Allocate bounce buffers with GFP_NOWAIT

There's no point in blocking on these allocations, since our fallback paths will
probably go faster than blocking.
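
A minimal sketch of the pattern, with a hypothetical fallback helper:

	/* don't sleep; the fallback is probably faster than blocking: */
	void *bounce = kmalloc(size, GFP_NOWAIT);

	if (!bounce)
		return do_fallback(b);	/* hypothetical slow path */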

Change-Id: I733ca202c25cb36bde02607a0a60552229a4241c
bcf090e0040e30f8409e6a535a01e6473afb096f 19-May-2014 Kent Overstreet <kmo@daterainc.com> bcache: Make sure to pass GFP_WAIT to mempool_alloc()

This was very wrong - mempool_alloc() only guarantees success with GFP_WAIT.
bcache uses GFP_NOWAIT in various other places where we have a fallback;
circuits must've gotten crossed when writing this code or something.
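
In other words, a sketch of the two call patterns (GFP_NOIO implies
__GFP_WAIT on kernels of this era):

	/* may return NULL - only correct where a fallback exists: */
	ptr = mempool_alloc(pool, GFP_NOWAIT);

	/* __GFP_WAIT set: sleeps until an element is freed back to the
	 * pool, so this is guaranteed to succeed: */
	ptr = mempool_alloc(pool, GFP_NOIO);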

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
c5aa4a3157b55bdca18dd2a9d9f43314470b6d32 22-Apr-2014 Slava Pestov <sp@daterainc.com> bcache: wait for buckets when allocating new btree root

Tested:
- sometimes bcache_tier test would hang on startup with a failure
to allocate the btree root -- no longer seeing this

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
3a2fd9d5090b83aab85378a846fa10f39b0b5aa7 28-Feb-2014 Kent Overstreet <kmo@daterainc.com> bcache: Kill bucket->gc_gen

gc_gen was a temporary used to recalculate last_gc, but since we only need
bucket->last_gc when gc isn't running (gc_mark_valid = 1), we can just update
last_gc directly.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2531d9ee61fa08a5a9ab8f002c50779888d232c7 18-Mar-2014 Kent Overstreet <kmo@daterainc.com> bcache: Kill unused freelist

This was originally added as an optimization that for various reasons isn't
needed anymore, but it does add a lot of nasty corner cases (and it was
responsible for some recently fixed bugs). Just get rid of it now.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
0a63b66db566cffdf90182eb6e66fdd4d0479e63 18-Mar-2014 Kent Overstreet <kmo@daterainc.com> bcache: Rework btree cache reserve handling

This changes the bucket allocation reserves to use _real_ reserves - separate
freelists - instead of watermarks, which if nothing else makes the current code
saner to reason about and is going to be important in the future when we add
support for multiple btrees.

It also adds btree_check_reserve(), which checks (and locks) the reserves for
both bucket allocation and memory allocation for btree nodes. The old code just
kinda sorta assumed that since (e.g. for btree node splits) it had the root
locked, no other threads could try to make use of the same reserve. That
technically should have been ok for memory allocation (we should always have a
memory allocation reserve, since the btree node cache is used as a reserve and
we preallocate it), but multiple btrees will mean that locking the root won't
be sufficient anymore, and for the bucket allocation reserve it was technically
possible for the old code to deadlock.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
56b30770b27d54d68ad51eccc6d888282b568cee 23-Jan-2014 Kent Overstreet <kmo@daterainc.com> bcache: Kill btree_io_wq

With the locking rework in the last patch, this shouldn't be needed anymore -
btree_node_write_work() only takes b->write_lock which is never held for very
long.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2a285686c109816ba71a00b9278262cf02648258 05-Mar-2014 Kent Overstreet <kmo@daterainc.com> bcache: btree locking rework

Add a new lock, b->write_lock, which is required to actually modify - or write -
a btree node; this lock is only held for short durations.

This means we can write out a btree node without taking b->lock, which _is_ held
for long durations - solving a deadlock when btree_flush_write() (from the
journalling code) is called with a btree node locked.

Right now this just occurs in bch_btree_set_root(), but with an upcoming
journalling rework it's going to happen a lot more.

This also means b->lock is now more of a read/intent lock instead of a
read/write lock - but not completely, since it still blocks readers. It may be
turned into a real intent lock at some point in the future.
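
Sketched, the write-out path becomes a short critical section on the new
lock (roughly the shape of the deferred-write worker after this patch;
names approximate):

	mutex_lock(&b->write_lock);
	if (btree_node_dirty(b))
		__bch_btree_node_write(b, NULL);	/* short hold time */
	mutex_unlock(&b->write_lock);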

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
05335cff9f01555b769ac97b7bacc472b7ed047a 18-Mar-2014 Kent Overstreet <kmo@daterainc.com> bcache: Fix a race when freeing btree nodes

This isn't a bulletproof fix; btree_node_free() -> bch_bucket_free() puts the
bucket on the unused freelist, where it can be reused right away without any
ordering requirements. It would be better to wait on at least a journal write to
go down before reusing the bucket. bch_btree_set_root() does this, and inserting
into non leaf nodes is completely synchronous so we should be ok, but future
patches are just going to get rid of the unused freelist - it was needed in the
past for various reasons but shouldn't be anymore.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
4fe6a816707aace9e8e297b708411c5930537793 13-Mar-2014 Kent Overstreet <kmo@daterainc.com> bcache: Add a real GC_MARK_RECLAIMABLE

This means the garbage collection code can better check for data and metadata
pointers to the same buckets.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
3f5e0a34daed197aa55d0c6b466bb4cd03babb4f 23-Jan-2014 Kent Overstreet <kmo@daterainc.com> bcache: Kill dead cgroup code

This hasn't been used or even enabled in ages.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
487dded86ea065317aea121bec8f1816f2f235c9 17-Mar-2014 Kent Overstreet <kmo@daterainc.com> bcache: Fix another bug recovering from unclean shutdown

The on disk bucket gens are allowed to be out of date when we reuse buckets
that didn't have any live data in them. To deal with this, the initial gc has to
update the bucket gen when we find a pointer gen newer than the bucket's gen.

Unfortunately we weren't doing this for pointers in the journal that we're about
to replay.
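
A sketch of the update, using bcache's existing pointer/bucket helpers
(abbreviated from the fix):

	struct bucket *g = PTR_BUCKET(c, k, i);

	/* the pointer gen is newer, so the on-disk bucket gen was stale: */
	if (gen_after(PTR_GEN(k, i), g->gen))
		g->gen = PTR_GEN(k, i);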

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
0bd143fd800055b1db756693289bbebdb93f2a73 05-Mar-2014 Kent Overstreet <kmo@daterainc.com> bcache: Fix a bug recovering from unclean shutdown

The code to fix up incorrect bucket prios incorrectly did not skip btree node
freeing keys.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
3572324af0f4ef877545e5a17bd3e788551f166a 11-Jan-2014 Kent Overstreet <kmo@daterainc.com> bcache: Minor fixes from kbuild robot

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
947174476701fbc84ea8c7ec9664270f9d80b076 29-Jan-2014 Darrick J. Wong <darrick.wong@oracle.com> bcache: fix BUG_ON due to integer overflow with GC_SECTORS_USED

The BUG_ON at the end of __bch_btree_mark_key can be triggered due to
an integer overflow error:

BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, 13);
...
SET_GC_SECTORS_USED(g, min_t(unsigned,
			     GC_SECTORS_USED(g) + KEY_SIZE(k),
			     (1 << 14) - 1));
BUG_ON(!GC_SECTORS_USED(g));

In bcache.h, the SECTORS_USED bitfield is defined to be 13 bits wide.
While the SET_ code tries to ensure that the field doesn't overflow by
clamping it to (1<<14)-1 == 16383, this is incorrect because 16383
requires 14 bits. Therefore, if GC_SECTORS_USED() + KEY_SIZE() =
8192, the SET_ statement tries to store 8192 into a 13-bit field. In
a 13-bit field, 8192 becomes zero, thus triggering the BUG_ON.

Therefore, create a field width constant and a max value constant, and
use those to create the bitfield and check the inputs to
SET_GC_SECTORS_USED. Arguably the BITMASK() template ought to have
BUG_ON checks for too-large values, but that's a separate patch.
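
Reconstructed from the description above, the fix looks roughly like this
(constant names as the commit describes creating them):

	#define GC_SECTORS_USED_SIZE	13
	#define MAX_GC_SECTORS_USED	(~(~0ULL << GC_SECTORS_USED_SIZE))

	BITMASK(GC_SECTORS_USED, struct bucket, gc_mark, 2, GC_SECTORS_USED_SIZE);

	/* clamp to the real field maximum, not (1 << 14) - 1: */
	SET_GC_SECTORS_USED(g, min_t(unsigned,
				     GC_SECTORS_USED(g) + KEY_SIZE(k),
				     MAX_GC_SECTORS_USED));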

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
3b3e9e50dd951725130645660b526c4f367dcdee 07-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Don't return -EINTR when insert finished

We need to return -EINTR after a split because we invalidated iterators
(and freed the btree node) - but if we were finished inserting, we don't
want to redo the traversal.
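
Sketched, the check becomes roughly ('split' is an illustrative flag;
bch_keylist_empty() is the existing helper):

	/* a split invalidated our iterators (and freed the node), but
	 * only restart the traversal if keys remain to be inserted: */
	if (split && !bch_keylist_empty(insert_keys))
		return -EINTR;

	return 0;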

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
829a60b9055c319f3656a01eb8cb78b1b86232ef 12-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Move insert_fixup() to btree_keys_ops

Now handling overlapping extents/keys is a method that's specific to what the
btree node contains.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
89ebb4a28ba9efb5c9b18ba552e784021957b14a 12-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert sorting to btree_keys

More work to disentangle various code from struct btree

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
dc9d98d621bdce0552997200ce855659875a5c9f 18-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert debug code to btree_keys

More work to disentangle various code from struct btree

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
c052dd9a26f60bcf70c0c3fcc08e07abb60295cd 12-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert btree_iter to struct btree_keys

More work to disentangle bset.c from struct btree

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
59158fde429fb5d18064e2734b3dd5e6048affbd 12-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add bch_btree_keys_u64s_remaining()

Helper function to explicitly check how much space is free in a btree node

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
a85e968e66a175c86d0410719ea84a5bd0f1d070 21-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add struct btree_keys

Soon, bset.c won't need to depend on struct btree.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
65d45231b56efb3db51eb441e2c68f8252ecdd12 21-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Abstract out stuff needed for sorting

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
ee811287c9f241641899788cbfc9d70ed96ba3a5 18-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Rename/shuffle various code around

More work to disentangle bset.c from the rest of the code.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
67539e85289c14a76a1c4162613d14a5f05a0027 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add struct bset_sort_state

More disentangling bset.c from the rest of the bcache code - soon, the
sorting routines won't have any dependencies on any outside structs.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
911c9610099f26e9e6ea3d1962ce24f53890b163 29-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Split out sort_extent_cmp()

Only use extent comparison for comparing extents, so we're not using
START_KEY() on other key types (i.e. btree pointers)

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
fafff81cead78157099df1ee10af16cc51893ddc 18-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Bkey indexing renaming

More refactoring:

node() -> bset_bkey_idx()
end() -> bset_bkey_last()

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
085d2a3dd4d65b7bce1dead987c647dbbc014281 12-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Make bch_keylist_realloc() take u64s, not nptrs

Getting away from KEY_PTRS and moving toward KEY_U64s - and getting rid of magic
2s

Also - split out the part that checks against journal entry size so as to avoid
a dependency on struct cache_set in bset.c

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
78b77bf8b20431f8ad8a4db7e3120103bd922337 18-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Btree verify code improvements

Used this fixed code to find and fix the bug fixed by
a4d885097b0ac0cd1337f171f2d4b83e946094d4.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
88b9f8c426f35e04738220c1bc05dd1ea1b513a3 18-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: kill index()

That was a terrible name for a macro, add some better helpers to replace it.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
5f5837d2d650db25b9153b91535e67a96b265f58 17-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Do bkey_put() in btree_split() error path

This error path shouldn't have been hit in practice... and we've got reworked
reserve code coming soon so that it shouldn't _ever_ be hit... but if we've got
code for this error path it should be correct.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
78365411b344df35a198b119133e6515c2dcfb9f 17-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Rework allocator reserves

We need a reserve for allocating buckets for new btree nodes - and now that
we've got multiple btrees, it really needs to be per btree.

This reworks the reserves so we've got separate freelists for each reserve
instead of watermarks, which seems to make things a bit cleaner, and it adds
some code so that btree_split() can make sure the reserve is available before it
starts.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
cb7a583e6a6ace661a5890803e115d2292a293df 17-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: kill closure locking usage

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
b0f32a56f27eb0df4124dbfc8eb6f09f423eed99 10-Dec-2013 Kent Overstreet <kmo@daterainc.com> bcache: Minor btree cache fix

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
bf0a628a95dba7f983b6047cea695fb066fb2512 27-Nov-2013 Nicholas Swenson <nks@daterainc.com> bcache: fix for gc and writeback race

Garbage collector needs to check keys in the writeback keybuf to
make sure it's not invalidating buckets to which the writeback
keys point.

Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
d24a6e1087030b6da286df9433add5fa2f21b83b 11-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Fix dirty_data accounting

Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.

NOTE FOR BACKPORTERS: For 3.10 (and 3.11?) there's other accounting fixes
necessary that got squashed in with other patches; the full patch against 3.10
is 408cc2f47eeac93a, available at:
git://evilpiepirate.org/~kent/linux-bcache.git bcache-3.10-writeback-fixes

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 2a46036..4a12b2f 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1817,7 +1817,8 @@ static bool fix_overlapping_extents(struct btree *b, struct bkey *insert,
 			if (KEY_START(k) > KEY_START(insert) + sectors_found)
 				goto check_failed;
 
-			if (KEY_PTRS(replace_key) != KEY_PTRS(k))
+			if (KEY_PTRS(k) != KEY_PTRS(replace_key) ||
+			    KEY_DIRTY(k) != KEY_DIRTY(replace_key))
 				goto check_failed;
 
 			/* skip past gen */
08239ca2a053dbc3b082916bdfbd88e5a9ad9267 28-Nov-2013 Wei Yongjun <yongjun_wei@trendmicro.com.cn> bcache: fix sparse non static symbol warning

Fixes the following sparse warning:

drivers/md/bcache/btree.c:2220:5: warning:
symbol 'btree_insert_fn' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
7988613b0e5b2638caf6cd493cc78e9595eba19c 24-Nov-2013 Kent Overstreet <kmo@daterainc.com> block: Convert bio_for_each_segment() to bvec_iter

More prep work for immutable biovecs - with immutable bvecs drivers
won't be able to use the biovec directly, they'll need to use helpers
that take into account bio->bi_iter.bi_bvec_done.

This updates callers for the new usage without changing the
implementation yet.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: support@lsi.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-m68k@lists.linux-m68k.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
Cc: cbe-oss-dev@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-raid@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: DL-MPTFusionLinux@lsi.com
Cc: linux-scsi@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: linux-fsdevel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: linux-mm@kvack.org
Acked-by: Geoff Levand <geoff@infradead.org>
4f024f3797c43cb4b73cd2c50cec728842d0e49e 12-Oct-2013 Kent Overstreet <kmo@daterainc.com> block: Abstract out bvec iterator

Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>
48a915a87f0bd98c3d68d029acf223a2e5116f07 31-Oct-2013 Kent Overstreet <kmo@daterainc.com> bcache: Better full stripe scanning

The old scanning-by-stripe code burned too much CPU, this should be
better.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
17e21a9f248d3d330acdfb2405c23b8d84c9c23a 26-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Have btree_split() insert into parent directly

The flow control in btree_insert_node() was... fragile... before;
this'll use more stack (but since our btrees are never more than depth
1, that shouldn't matter) and it should be significantly clearer and
less fragile.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
65d22e911bfc4f46cda4751f1b1926b43c316c14 31-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Move spinlock into struct time_stats

Minor cleanup.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
50310164bcd789eb3690f45a9baf8a507bf93358 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Kill bch_next_recurse_key()

This dates from before the btree iterator, and now it's finally gone

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
bc9389eefe479b7b7b323c2729b61a7155d2d0ea 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Avoid deadlocking in garbage collection

Not a complete fix - we could still deadlock if btree_insert_node() has
to split...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
a1f0358b2bf69be216cb6e4ea40fe7ae4d38b8a6 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Incremental gc

Big garbage collection rewrite; now, garbage collection uses the same
mechanisms as used elsewhere for inserting/updating btree node pointers,
instead of rewriting interior btree nodes in place.

This makes the code significantly cleaner and less fragile, and means we
can now make garbage collection incremental - it doesn't have to hold a
write lock on the root of the btree for the entire duration of garbage
collection.

This means that there's less of a latency hit for doing garbage
collection, which means we can gc more frequently (and do a better job
of reclaiming from the cache), and we can coalesce across more btree
nodes (improving our space efficiency).

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
8835c1234dd9a838993a2d5cb7572f57992ebbee 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add make_btree_freeing_key()

Refactoring, prep work for incremental garbage collection.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
f269af5a078302712de8ee70d273eba2eb4485ca 24-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add btree_node_write_sync()

More refactoring - mostly making the interfaces more explicit about what
we actually want to do.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
0eacac22034ca21c73fe49e800d0b938b2047250 02-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: PRECEDING_KEY()

btree_insert_key() was open coding this, this is just refactoring.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
3a3b6a4e075188342b58d4b6560f5540af64cac0 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Don't bother with bucket refcount for btree node allocations

The bucket refcount (dropped with bkey_put()) is only needed to prevent
the newly allocated bucket from being garbage collected until we've
added a pointer to it somewhere. But for btree node allocations, the
fact that we have btree nodes locked is enough to guard against races
with garbage collection.

Eventually the per bucket refcount is going to be replaced with
something specific to bch_alloc_sectors().

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
280481d06c8a683d9aaa26125476222e76b733c5 25-Oct-2013 Kent Overstreet <kmo@daterainc.com> bcache: Debug code improvements

Couple changes:
* Consolidate bch_check_keys() and bch_check_key_order(), and move the
checks that only check_key_order() could do to bch_btree_iter_next().

* Get rid of CONFIG_BCACHE_EDEBUG - now, all that code is compiled in
when CONFIG_BCACHE_DEBUG is enabled, and there's now a sysfs file to
flip on the EDEBUG checks at runtime.

* Dropped an old not terribly useful check in rw_unlock(), and
refactored/improved some of the other debug code.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
81ab4190ac17df41686a37c97f701623276b652a 31-Oct-2013 Kent Overstreet <kmo@daterainc.com> bcache: Pull on disk data structures out into a separate header

Now, the on disk data structures are in a header that can be exported to
userspace - and having them all centralized is nice too.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
cc7b8819212f437fc82f0f9cdc24deb0fb5d775f 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert bch_btree_insert() to bch_btree_map_leaf_nodes()

Last of the btree_map() conversions. Main visible effect is
bch_btree_insert() is no longer taking a struct btree_op as an argument
anymore - there's no fancy state machine stuff going on, it's just a
normal function.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
6054c6d4da1940c7bf8870c6393773aa794f53d8 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Don't use op->insert_collision

When we convert bch_btree_insert() to bch_btree_map_leaf_nodes(), we
won't be passing struct btree_op to bch_btree_insert() anymore - so we
need a different way of returning whether there was a collision (really,
a replace collision).

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
1b207d80d5b986fb305bc899357435d319319513 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Kill op->replace

This is prep work for converting bch_btree_insert to
bch_btree_map_leaf_nodes() - we have to convert all its arguments to
actual arguments. Bunch of churn, but should be straightforward.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
faadf0c96547ec8277ad0abd6959f2ef48522f31 02-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Drop some closure stuff

With the recent bcache refactoring, some of the closure code isn't
needed anymore.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
b54d6934da7857f87b092df9b77dc1f42818ba94 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Kill op->cl

This isn't used for waiting asynchronously anymore - so this is a fairly
trivial refactoring.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
c18536a72ddd7fe30d63e6c1500b5c930ac14594 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Prune struct btree_op

Eventual goal is for struct btree_op to contain only what is necessary
for traversing the btree.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2c1953e201a05ddfb1ea53f23d81a492c6513028 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert bch_btree_read_async() to bch_btree_map_keys()

This is a fairly straightforward conversion, mostly reshuffling -
op->lookup_done goes away, replaced by MAP_DONE/MAP_CONTINUE. And the
code for handling cache hits and misses wasn't really btree code, so it
gets moved to request.c.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
df8e89701fb02cba6e09c5f46f002778b5b52dd2 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Move some stuff to btree.c

With the new btree_map() functions, we don't need to export the stuff
needed for traversing the btree anymore.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
48dad8baf92fe8967d9e1358af1cfdda1d2d3298 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add btree_map() functions

Lots of stuff has been open coding its own btree traversal - which is
generally pretty simple code, but there are a few subtleties.

This adds two new functions, bch_btree_map_nodes() and
bch_btree_map_keys(), which do the traversal for you. Everything that's
open coding btree traversal now (with the exception of garbage
collection) is slowly going to be converted to these two functions;
being able to write other code at a higher level of abstraction is a
big improvement w.r.t. overall code quality.
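
A hedged usage sketch (the callback and its body are illustrative;
MAP_CONTINUE/MAP_DONE are the return codes these functions use):

	static int dump_keys_fn(struct btree_op *op, struct btree *b,
				struct bkey *k)
	{
		/* visit k here; returning MAP_DONE would stop the walk */
		return MAP_CONTINUE;
	}

	/* walk every key in every leaf node, starting from the front: */
	ret = bch_btree_map_keys(&op, c, &ZERO_KEY, dump_keys_fn, 0);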

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
72a44517f3ca3725dc86081d105457df46448679 25-Oct-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert gc to a kthread

We needed a dedicated rescuer workqueue for gc anyways... and gc was
conceptually a dedicated thread, just one that wasn't running all the
time. Switch it to a dedicated thread to make the code a bit more
straightforward.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
35fcd848d72683141052aa9880542461577f2dbe 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert bucket_wait to wait_queue_head_t

At one point we did do fancy asynchronous waiting stuff with
bucket_wait, but that's all gone (and bucket_wait is used a lot less
than it used to be). So use the standard primitives.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
e8e1d4682c8cb06dbcb5ef7bb851bf9bcb889c84 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert try_wait to wait_queue_head_t

We never waited on c->try_wait asynchronously, so just use the standard
primitives.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
0b93207abb40d3c42bb83eba1e1e7edc1da77810 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Move keylist out of btree_op

Slowly working on pruning struct btree_op - the aim is for it to only
contain things that are actually necessary for traversing the btree.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
a34a8bfd4e6358c646928320d37b0425c0762f8a 25-Oct-2013 Kent Overstreet <kmo@daterainc.com> bcache: Refactor journalling flow control

Making things that don't need to be asynchronous less asynchronous - bch_journal()
only has to block when the journal or journal entry is full, which is
emphatically not a fast path. So make it a normal function that just
returns when it finishes, to make the code and control flow easier to
follow.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
c2f95ae2ebbe1ab61b1d4437f5923fdf720d4d4d 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Clean up keylist code

More random refactoring.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
4f3d40147b8d0ce7055e241e1d263e0aa2b2b46d 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add explicit keylist arg to btree_insert()

Some refactoring - better to explicitly pass stuff around instead of
having it all in the "big bag of state", struct btree_op. Going to prune
struct btree_op quite a bit over time.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
e7c590eb63509c5d5f48a390d23aa25f4417ac96 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Convert btree_insert_check_key() to btree_insert_node()

This was the main point of all this refactoring - now,
btree_insert_check_key() won't fail just because the leaf node happened
to be full.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
403b6cdeb1a38d896ffcb1f99ddcfd4e343b5d69 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Insert multiple keys at a time

We'll often end up with a list of adjacent keys to insert -
because bch_data_insert() may have to fragment the data it writes.

Originally, to simplify things and avoid having to deal with corner
cases bch_btree_insert() would pass keys from this list one at a time to
btree_insert_recurse() - mainly because the list of keys might span leaf
nodes, so it was easier this way.

With the btree_insert_node() refactoring, it's now a lot easier to just
pass down the whole list and have btree_insert_recurse() iterate over
leaf nodes until it's done.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
26c949f8062cb9221a28b2228104f1cc5b265097 11-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Add btree_insert_node()

The flow of control in the old btree insertion code was rather -
backwards; we'd recurse down the btree (in btree_insert_recurse()), and
then if we needed to split the keys to be inserted into the parent node
would be effectively returned up to btree_insert_recurse(), which would
notice there was more work to do and finish the insertion.

The main problem with this was that the full logic for btree insertion
could only be used by calling btree_insert_recurse; if you'd gotten to a
btree leaf some other way and had a key to insert, and it turned out that
node needed to be split, you were SOL.

This inverts the flow of control so btree_insert_node() does _full_
btree insertion, including splitting - and takes a (leaf) btree node to
insert into as a parameter.

This means we can now _correctly_ handle cache misses - for cache
misses, we need to insert a fake "check" key into the btree when we
discover we have a cache miss - while we still have the btree locked.
Previously, if the btree node was full inserting a cache miss would just
fail.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
d6fd3b11cea82346837957feab25b0be48aa424c 25-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Explicitly track btree node's parent

This is prep work for the reworked btree insertion code.

The way we set b->parent is ugly and hacky... the problem is, when
btree_split() or garbage collection splits or rewrites a btree node, the
parent changes for all its (potentially already cached) children.

I may change this later and add some code to look through the btree node
cache and find all our cached child nodes and change the parent pointer
then...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
1fa8455deb92e9ec7756df23030e73b2d28eeca7 11-Nov-2013 Kent Overstreet <kmo@daterainc.com> bcache: Fix dirty_data accounting

Dirty data accounting wasn't quite right - firstly, we were adding the key we're
inserting after it could have merged with another dirty key already in the
btree, and secondly we could sometimes pass the wrong offset to
bcache_dev_sectors_dirty_add() for dirty data we were overwriting - which is
important when tracking dirty data by stripe.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
a698e08c82dfb9771e0bac12c7337c706d729b6d 24-Sep-2013 Kent Overstreet <kmo@daterainc.com> bcache: Fix a shrinker deadlock

GFP_NOIO means we could be getting called recursively - mca_alloc() ->
mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then.
Whoops.
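
The fix is the classic trylock pattern for reclaim paths (roughly what the
patch does):

	/* !__GFP_WAIT means we may be called recursively from
	 * mca_alloc() with bucket_lock already held - never block then: */
	if (sc->gfp_mask & __GFP_WAIT)
		mutex_lock(&c->bucket_lock);
	else if (!mutex_trylock(&c->bucket_lock))
		return -1;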

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
61cbd250f867f98bb4738000afc6002d6f2b14bd 24-Sep-2013 Geert Uytterhoeven <geert@linux-m68k.org> bcache: Correct printf()-style format length modifier

Fix

drivers/md/bcache/btree.c: In function ‘bch_btree_node_read’:
drivers/md/bcache/btree.c:259: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘size_t’
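
The portable fix for size_t is the 'z' length modifier (illustrative line,
not the exact statement from btree.c):

	pr_debug("%zu keys", nr_keys);	/* nr_keys is a size_t */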

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7dc19d5affd71370754a2c3d36b485810eaee7a1 28-Aug-2013 Dave Chinner <dchinner@redhat.com> drivers: convert shrinkers to new count/scan API

Convert the driver shrinkers to the new API. Most changes are compile
tested only because I either don't have the hardware or it's staging
stuff.

FWIW, the md and android code is pretty good, but the rest of it makes me
want to claw my eyes out. The amount of broken code I just encountered is
mind boggling. I've added comments explaining what is broken, but I fear
that some of the code would be best dealt with by being dragged behind the
bike shed, buried in mud up to its neck and then run over repeatedly
with a blunt lawn mower.

Special mention goes to the zcache/zcache2 drivers. They can't co-exist
in the build at the same time, they are under different menu options in
menuconfig, they only show up when you've got the right set of mm
subsystem options configured and so even compile testing is an exercise in
pulling teeth. And that doesn't even take into account the horrible,
broken code...

[glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache]
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Glauber Costa <glommer@openvz.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Rientjes <rientjes@google.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: J. Bruce Fields <bfields@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
29ebf465b9050f241c4433a796a32e6c896a9dcd 12-Jul-2013 Kent Overstreet <kmo@daterainc.com> bcache: Fix GC_SECTORS_USED() calculation

Part of the job of garbage collection is to add up however many sectors
of live data it finds in each bucket, but that doesn't work very well if
it doesn't reset GC_SECTORS_USED() when it starts. Whoops.

This wouldn't have broken anything horribly, but allocation tries to
preferentially reclaim buckets that are mostly empty and that's not
gonna work with an incorrect GC_SECTORS_USED() value.
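
Sketched, the missing reset at gc start looks like (for_each_bucket and
SET_GC_SECTORS_USED are existing bcache helpers; surrounding code elided):

	for_each_bucket(g, ca)
		SET_GC_SECTORS_USED(g, 0);	/* re-counted as gc walks the btree */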

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.10
8e51e414a3c6d92ef2cc41720c67342a8e2c0bf7 07-Jun-2013 Kent Overstreet <koverstreet@google.com> bcache: Use standard utility code

Some of bcache's utility code has made it into the rest of the kernel,
so drop the bcache versions.

Bcache used to have a workaround for allocating from a bio set under
generic_make_request() (if you allocated more than once, the bios you
already allocated would get stuck on current->bio_list when you
submitted, and you'd risk deadlock) - bcache would mask out __GFP_WAIT
when allocating bios under generic_make_request() so that allocation
could fail and it could retry from workqueue. But bio_alloc_bioset() has
a workaround now, so we can drop this hack and the associated error
handling.
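
The dropped workaround looked roughly like this (sketch; the retry helper
is hypothetical):

	/* mask __GFP_WAIT under generic_make_request() so the allocation
	 * fails instead of deadlocking on current->bio_list... */
	bio = bio_alloc_bioset(gfp_mask & ~__GFP_WAIT, nr_iovecs, bs);
	if (!bio)
		return punt_to_workqueue(cl);	/* ...then retry from there */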

Signed-off-by: Kent Overstreet <koverstreet@google.com>
f3059a54610f6c516c0942d58b9435921768ce2d 16-May-2013 Kent Overstreet <koverstreet@google.com> bcache: Delete fuzz tester

This code has rotted and it hasn't been used in ages anyways.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
36c9ea9837c1cb21c778781495101eaff7e5eb56 03-Jun-2013 Kent Overstreet <koverstreet@google.com> bcache: Document shrinker reserve better

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
e49c7c374e7aacd1f04ecbc21d9dbbeeea4a77d6 27-Jun-2013 Kent Overstreet <koverstreet@google.com> bcache: FUA fixes

Journal writes need to be marked FUA, not just REQ_FLUSH. And btree node
writes have... weird ordering requirements.
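
For the journal side, the flag change amounts to roughly (sketch using the
block-layer flags of this era):

	/* flush preceding writes AND force this write itself to media: */
	bio->bi_rw |= REQ_WRITE | REQ_FLUSH | REQ_FUA;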

Signed-off-by: Kent Overstreet <koverstreet@google.com>
72c270612bd33192fa836ad0f2939af1ca218292 05-Jun-2013 Kent Overstreet <koverstreet@google.com> bcache: Write out full stripes

Now that we're tracking dirty data per stripe, we can add two
optimizations for raid5/6:

* If a stripe is already dirty, force writes to that stripe to
writeback mode - to help build up full stripes of dirty data

* When flushing dirty data, preferentially write out full stripes first
if there are any.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
279afbad4e54acbd61bf88a54a73af3bbfdeb5dd 05-Jun-2013 Kent Overstreet <koverstreet@google.com> bcache: Track dirty data by stripe

To make background writeback aware of raid5/6 stripes, we first need to
track the amount of dirty data within each stripe - we do this by
breaking up the existing sectors_dirty into per stripe atomic_ts
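
Sketched, the accounting becomes per stripe (field names follow the scheme
described; the real code also handles writes spanning stripe boundaries):

	unsigned stripe = offset / d->stripe_size;

	/* one counter per stripe instead of a single sectors_dirty: */
	atomic_add(nr_sectors, d->stripe_sectors_dirty + stripe);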

Signed-off-by: Kent Overstreet <koverstreet@google.com>
444fc0b6b167ed164e7436621a9d095e042644dd 12-May-2013 Kent Overstreet <koverstreet@google.com> bcache: Initialize sectors_dirty when attaching

Previously, dirty_data wouldn't get initialized until the first garbage
collection... which was a bit of a problem for background writeback (as
the PD controller keys off of it) and also confusing for users.

This is also prep work for making background writeback aware of raid5/6
stripes.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
85b1492ee113486d871de7676a61f506a43ca475 15-May-2013 Kent Overstreet <koverstreet@google.com> bcache: Rip out pkey()/pbtree()

Old gcc doesn't like the struct hack, and it is kind of ugly. So finish
off the work to convert pr_debug() statements to tracepoints, and delete
pkey()/pbtree().

Signed-off-by: Kent Overstreet <koverstreet@google.com>
c37511b863f36c1cc6e18440717fd4cc0e881b8a 27-Apr-2013 Kent Overstreet <koverstreet@google.com> bcache: Fix/revamp tracepoints

The tracepoints were reworked to be more sensible, and a null
pointer deref in one of the tracepoints was fixed.

Converted some of the pr_debug()s to tracepoints - this is partly a
performance optimization; it used to be that without DEBUG or
CONFIG_DYNAMIC_DEBUG, pr_debug() compiled to an empty macro, but at some
point it was changed to an empty inline function.

Some of the pr_debug() statements had rather expensive function calls as
part of the arguments, so this code was getting run unnecessarily even
on non debug kernels - in some fast paths, too.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
5794351146199b9ac67a5ab1beab82be8bfd7b5d 25-Apr-2013 Kent Overstreet <koverstreet@google.com> bcache: Refactor btree io

The most significant change is that btree reads are now done
synchronously, instead of asynchronously and doing the post read stuff
from a workqueue.

This was originally done because we can't block on IO under
generic_make_request(). But - we already have a mechanism to punt cache
lookups to workqueue if needed, so if we just use that we don't have to
deal with the complexity of doing things asynchronously.

The main benefit is this makes the locking situation saner; we can hold
our write lock on the btree node until we're finished reading it, and we
don't need that btree_node_read_done() flag anymore.

Also, for writes, btree_write() was broken out into btree_node_write()
and btree_leaf_dirty() - the old code with the boolean argument was dumb
and confusing.

The prio_blocked mechanism was improved a bit too: now the only counter
is in struct btree_write, and we don't mess with transferring a count from
struct btree anymore.

This required changing garbage collection to block prios at the start
and unblock when it finishes, which is cleaner than what it was doing
anyways (the old code had mostly the same effect, but was doing it in a
convoluted way)

And the btree iter btree_node_read_done() uses was converted to a real
mempool.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
119ba0f82839cd80eaef3e6991988f1403965d5b 25-Apr-2013 Kent Overstreet <koverstreet@google.com> bcache: Convert allocator thread to kthread

Using a workqueue when we just want a single thread is a bit silly.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
86b26b824cf5d15d4408b33d6d716104f249e8bd 01-May-2013 Kent Overstreet <koverstreet@google.com> bcache: Allocator cleanup/fixes

The main fix is that bch_allocator_thread() wasn't waiting on
garbage collection to finish (if invalidate_buckets had set
ca->invalidate_needs_gc); we need that to make sure the allocator
doesn't spin and potentially block gc from finishing.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
cd953ed0363b28e3dc503b735cc4079e9f5edba7 27-Mar-2013 Geert Uytterhoeven <geert@linux-m68k.org> bcache: Add missing #include <linux/prefetch.h>

m68k/allmodconfig:

drivers/md/bcache/bset.c: In function ‘bset_search_tree’:
drivers/md/bcache/bset.c:727: error: implicit declaration of function ‘prefetch’

drivers/md/bcache/btree.c: In function ‘bch_btree_node_get’:
drivers/md/bcache/btree.c:933: error: implicit declaration of function ‘prefetch’

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
c19ed23a0b1848eca6b6f22c1ee233abe54d37f9 26-Mar-2013 Kent Overstreet <koverstreet@google.com> bcache: Sparse fixes

Signed-off-by: Kent Overstreet <koverstreet@google.com>
169ef1cf6171d35550fef85645b83b960e241cff 28-Mar-2013 Kent Overstreet <koverstreet@google.com> bcache: Don't export utility code, prefix with bch_

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
b1a67b0f4c747ca10c96ebb24f04e2a74b3c298d 25-Mar-2013 Kent Overstreet <koverstreet@google.com> bcache: Style/checkpatch fixes

Took out some nested functions, and fixed some more checkpatch
complaints.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
07e86ccb543bb1e748f32d6f0f18913d3f58d988 25-Mar-2013 Kent Overstreet <koverstreet@google.com> bcache: Build fixes from test robot

config: make ARCH=i386 allmodconfig

All error/warnings:

drivers/md/bcache/bset.c: In function 'bch_ptr_bad':
>> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
drivers/md/bcache/debug.c: In function 'bch_pbtree':
>> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat]
--
drivers/md/bcache/btree.c: In function 'bch_btree_read_done':
>> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat]
--
drivers/md/bcache/closure.o: In function `closure_debug_init':
>> (.init.text+0x0): multiple definition of `init_module'
>> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: linux-bcache@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
cafe563591446cf80bfbc2fe3bc72a2e36cf1060 24-Mar-2013 Kent Overstreet <koverstreet@google.com> bcache: A block layer cache

Does writethrough and writeback caching, handles unclean shutdown, and
has a bunch of other nifty features motivated by real world usage.

See the wiki at http://bcache.evilpiepirate.org for more.

Signed-off-by: Kent Overstreet <koverstreet@google.com>