History log of /drivers/md/dm-cache-policy-mq.c
Revision Date Author Comments
14f398ca2f26a2ed6236aec54395e0fa06ec8a82 28-Feb-2014 Heinz Mauelshagen <heinzm@redhat.com> dm cache mq: fix memory allocation failure for large cache devices

The memory allocated for the multiqueue policy's hash table doesn't need
to be physically contiguous. Use vzalloc() instead of kzalloc().
Fedora has been carrying this fix since 10/10/2013.

Failure seen during creation of a 10TB cached device with a 2048 sector
block size and 411GB cache size:

dmsetup: page allocation failure: order:9, mode:0x10c0d0
CPU: 11 PID: 29235 Comm: dmsetup Not tainted 3.10.4 #3
Hardware name: Supermicro X8DTL/X8DTL, BIOS 2.1a 12/30/2011
000000000010c0d0 ffff880090941898 ffffffff81387ab4 ffff880090941928
ffffffff810bb26f 0000000000000009 000000000010c0d0 ffff880090941928
ffffffff81385dbc ffffffff815f3840 ffffffff00000000 000002000010c0d0
Call Trace:
[<ffffffff81387ab4>] dump_stack+0x19/0x1b
[<ffffffff810bb26f>] warn_alloc_failed+0x110/0x124
[<ffffffff81385dbc>] ? __alloc_pages_direct_compact+0x17c/0x18e
[<ffffffff810bda2e>] __alloc_pages_nodemask+0x6c7/0x75e
[<ffffffff810bdad7>] __get_free_pages+0x12/0x3f
[<ffffffff810ea148>] kmalloc_order_trace+0x29/0x88
[<ffffffff810ec1fd>] __kmalloc+0x36/0x11b
[<ffffffffa031eeed>] ? mq_create+0x1dc/0x2cf [dm_cache_mq]
[<ffffffffa031efc0>] mq_create+0x2af/0x2cf [dm_cache_mq]
[<ffffffffa0314605>] dm_cache_policy_create+0xa7/0xd2 [dm_cache]
[<ffffffffa0312530>] ? cache_ctr+0x245/0xa13 [dm_cache]
[<ffffffffa031263e>] cache_ctr+0x353/0xa13 [dm_cache]
[<ffffffffa012b916>] dm_table_add_target+0x227/0x2ce [dm_mod]
[<ffffffffa012e8e4>] table_load+0x286/0x2ac [dm_mod]
[<ffffffffa012e65e>] ? dev_wait+0x8a/0x8a [dm_mod]
[<ffffffffa012e324>] ctl_ioctl+0x39a/0x3c2 [dm_mod]
[<ffffffffa012e35a>] dm_ctl_ioctl+0xe/0x12 [dm_mod]
[<ffffffff81101181>] vfs_ioctl+0x21/0x34
[<ffffffff811019d3>] do_vfs_ioctl+0x3b1/0x3f4
[<ffffffff810f4d2e>] ? ____fput+0x9/0xb
[<ffffffff81050b6c>] ? task_work_run+0x7e/0x92
[<ffffffff81101a68>] SyS_ioctl+0x52/0x82
[<ffffffff81391d92>] system_call_fastpath+0x16/0x1b
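
The order:9 in the failure above can be reproduced with a little arithmetic. The sketch below assumes, purely for illustration, that the hash table has one 8-byte bucket head per bucket and that the bucket count is half the number of cache blocks rounded up to a power of two (minimum 16). Neither figure is stated in the commit message, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Smallest power-of-two multiple of min that is >= n (min must itself be a
 * power of two). ASSUMPTION: models how the bucket count is rounded. */
static uint64_t next_power(uint64_t n, uint64_t min)
{
	uint64_t p = min;

	while (p < n)
		p <<= 1;
	return p;
}

/* Bytes needed for the hash table of a cache of cache_bytes bytes divided
 * into block_bytes-sized blocks.
 * ASSUMPTION: buckets = nr_cblocks / 2 rounded up, 8 bytes per bucket. */
static uint64_t table_bytes(uint64_t cache_bytes, uint64_t block_bytes)
{
	uint64_t nr_cblocks = cache_bytes / block_bytes;

	return next_power(nr_cblocks / 2, 16) * 8;
}

/* Page-allocator order required to hold bytes contiguously, i.e. the
 * smallest order such that page_size << order >= bytes. */
static unsigned alloc_order(uint64_t bytes, uint64_t page_size)
{
	unsigned order = 0;
	uint64_t span = page_size;

	while (span < bytes) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions, a 411GB cache with 2048-sector (1MiB) blocks has 420864 cache blocks, which gives 262144 buckets and a 2MiB table. With 4KiB pages that is 512 physically contiguous pages, an order-9 request, which easily fails on a fragmented system. vzalloc() only needs the pages to be virtually contiguous.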

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
2e68c4e6caad9fdadc1cef8b6cb9569192e8a42b 16-Jan-2014 Mike Snitzer <snitzer@redhat.com> dm cache: add policy name to status output

The cache's policy may have been established using the "default" alias,
which is currently the "mq" policy but the default policy may change in
the future. It is useful to know exactly which policy is being used.

Add a 'real' member to the dm_cache_policy_type structure and have the
"default" dm_cache_policy_type point to the real "mq"
dm_cache_policy_type. Update dm_cache_policy_get_name() to check if
real is set, if so report the name of the real policy (not the alias).
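
A minimal sketch of the alias lookup described above. The structure is a trimmed-down, hypothetical stand-in for dm_cache_policy_type (only the two members needed here), and policy_get_name() is an illustrative name, not the kernel function.

```c
#include <stddef.h>

/* Hypothetical stand-in for dm_cache_policy_type. */
struct policy_type {
	const char *name;
	const struct policy_type *real;	/* non-NULL only for an alias */
};

static const struct policy_type mq_type = { "mq", NULL };
static const struct policy_type default_type = { "default", &mq_type };

/* Report the name of the real policy when the type is an alias. */
static const char *policy_get_name(const struct policy_type *t)
{
	return t->real ? t->real->name : t->name;
}
```

With this layout, both &default_type and &mq_type report "mq", so the status output stays accurate even if the default alias later points at a different policy.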

Requested-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
78e03d69733c48312ae81fe4ac0790dbea412b9d 09-Dec-2013 Joe Thornber <ejt@redhat.com> dm cache policy mq: introduce three promotion threshold tunables

Internally the mq policy maintains a promotion threshold variable. If
the hit count of a block not in the cache goes above this threshold it
gets promoted to the cache.

This patch introduces three new tunables that allow you to tweak the
promotion threshold by adding a small value. These adjustments depend
on the io type:

read_promote_adjustment: READ io, default 4
write_promote_adjustment: WRITE io, default 8
discard_promote_adjustment: READ/WRITE io to a discarded block, default 1

If you're trying to quickly warm a new cache device you may wish to
reduce these to encourage promotion. Remember to switch them back to
their defaults after the cache fills though.
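
The effect of the tunables can be sketched as follows. This is a simplified illustration, not the policy's code: the type and function names are hypothetical, and the strict "greater than" comparison is an assumption taken from the wording "goes above this threshold".

```c
#include <stdbool.h>

/* Hypothetical classification of the io that hit an uncached block. */
enum io_kind { IO_READ, IO_WRITE, IO_TO_DISCARDED_BLOCK };

struct promote_tunables {
	unsigned read_promote_adjustment;	/* default 4 */
	unsigned write_promote_adjustment;	/* default 8 */
	unsigned discard_promote_adjustment;	/* default 1 */
};

/* Internal threshold plus the adjustment that matches the io type. */
static unsigned adjusted_threshold(unsigned promote_threshold,
				   const struct promote_tunables *t,
				   enum io_kind kind)
{
	switch (kind) {
	case IO_READ:
		return promote_threshold + t->read_promote_adjustment;
	case IO_WRITE:
		return promote_threshold + t->write_promote_adjustment;
	default:
		return promote_threshold + t->discard_promote_adjustment;
	}
}

/* ASSUMPTION: promote once the hit count exceeds the adjusted threshold. */
static bool should_promote(unsigned hit_count, unsigned promote_threshold,
			   const struct promote_tunables *t, enum io_kind kind)
{
	return hit_count > adjusted_threshold(promote_threshold, t, kind);
}
```

Lowering an adjustment lowers the bar for that io type only, which is why reducing the values helps warm a new cache.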

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
b815805154cc62debbc423a6c27ae39290b300ae 18-Nov-2013 Wei Yongjun <yongjun_wei@trendmicro.com.cn> dm cache policy mq: use list_del_init instead of list_del + INIT_LIST_HEAD
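
For readers unfamiliar with the helper, the following userspace sketch re-implements the relevant list primitives to show why the two forms are equivalent. It mirrors the well-known semantics of the kernel's circular doubly linked list but is an illustration, not the contents of <linux/list.h>.

```c
/* Userspace re-implementation for illustration only. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *l)
{
	l->next = l;
	l->prev = l;
}

/* Insert n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink e and leave it as a valid empty list, in a single call. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}
```

After list_del_init() the removed node points at itself, so it can be tested with list_empty() or re-added without a separate INIT_LIST_HEAD() call.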

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
af95e7a69b54bca48092e3013a92cfa3043c9c73 15-Nov-2013 Joe Thornber <ejt@redhat.com> dm cache policy mq: fix promotions to occur as expected

Micro benchmarks that repeatedly issued IO to a single block were
failing to cause a promotion from the origin device to the cache. Fix
this by not updating the stats during map() if -EWOULDBLOCK will be
returned.

The mq policy will only update stats, consider migration, etc, once per
tick period (a unit of time established between dm-cache core and the
policies).

When the IO thread calls the policy's map method and the policy would
like to migrate the associated block, map returns -EWOULDBLOCK. The IO
then gets handed over to a worker thread which handles the migration. The worker
thread calls map again, to check the migration is still needed (avoids a
race among other things). *BUT*, before this fix, if we were still in
the same tick period the stats were already updated by the previous map
call -- so the migration would no longer be requested.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4f024f3797c43cb4b73cd2c50cec728842d0e49e 12-Oct-2013 Kent Overstreet <kmo@daterainc.com> block: Abstract out bvec iterator

Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>
7b6b2bc98c0303b7f043ad5b35906f833e56308d 12-Nov-2013 Mike Snitzer <snitzer@redhat.com> dm cache: resolve small nits and improve Documentation

Document passthrough mode, cache shrinking, and cache invalidation.
Also, use strcasecmp() and hlist_unhashed().

Reported-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
532906aa7f9656209f30f08dfadd328fc1bc6912 08-Nov-2013 Joe Thornber <ejt@redhat.com> dm cache: add remove_cblock method to policy interface

Implement policy_remove_cblock() and add remove_cblock method to the mq
policy. These methods will be used by the following cache block
invalidation patch which adds the 'invalidate_cblocks' message to the
cache core.

Also, update some comments in dm-cache-policy.h

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
633618e3353f8953e43d989d08302f5dcd51d8be 09-Nov-2013 Joe Thornber <ejt@redhat.com> dm cache policy mq: reduce memory requirements

Rather than storing the cblock in each cache entry, we allocate all
entries in an array and infer the cblock from the entry position.

Saves 4 bytes of memory per cache block. In addition, this gives us an
easy way of looking up cache entries by cblock.

We no longer need to keep an explicit bitset to track which cblocks
have been allocated. And no searching is needed to find free cblocks.
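
The idea of inferring the cblock from the entry's position can be sketched with plain pointer arithmetic. The structure and function names below are hypothetical and the entry is reduced to two illustrative fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache entry: note that it stores no cblock field. */
struct entry {
	uint64_t oblock;
	unsigned hit_count;
};

/* All entries live in one array, one slot per cache block. */
struct entry_pool {
	struct entry *entries;
	size_t nr_entries;
};

/* The cblock is simply the entry's index within the array. */
static size_t infer_cblock(const struct entry_pool *p, const struct entry *e)
{
	return (size_t)(e - p->entries);
}

/* Reverse lookup by cblock is a bounds-checked array access. */
static struct entry *entry_for_cblock(struct entry_pool *p, size_t cblock)
{
	return cblock < p->nr_entries ? &p->entries[cblock] : NULL;
}
```

Because the index and the entry determine each other, both the per-entry cblock field and a separate allocation bitset become redundant.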

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
c86c30706caa02ffe303e6b87d53ef6a077d4cca 24-Oct-2013 Joe Thornber <ejt@redhat.com> dm cache: be much more aggressive about promoting writes to discarded blocks

Previously these promotions only got priority if there were unused cache
blocks. Now we give them priority if there are any clean blocks in the
cache.

The fio_soak_test in the device-mapper-test-suite now gives uniform
performance across subvolumes (~16 seconds).

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
01911c19bea63b1a958b9d9024504c2e9079f155 24-Oct-2013 Joe Thornber <ejt@redhat.com> dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty()

There are now two multiqueues for in-cache blocks: a clean one and a
dirty one.

writeback_work comes from the dirty one. Demotions come from the clean
one.

There are two benefits:
- Performance improvement, since demoting a clean block is a noop.
- The cache cleans itself when io load is light.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
0184b44e321dda893d4d4be33499d404718c3a86 24-Oct-2013 Joe Thornber <ejt@redhat.com> dm cache policy mq: a few small fixes

Rename takeout_queue to concat_queue.

Fix a harmless bug in the mq policy's pop() function. Currently pop()
always succeeds; with upcoming changes this won't be the case.

Fix typo in comment above pre_cache_to_cache prototype.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
99ba2ae4cd876bbcedb01d94c1a7952ce171418e 21-Oct-2013 Joe Thornber <ejt@redhat.com> dm cache policy mq: protect residency method with existing mutex

It is safe to use a mutex in mq_residency() at this point since it is
only called from ioctl context. But future-proof mq_residency() by
using might_sleep() to catch new contexts that cannot sleep.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
b936bf8b785f0fbe083d203049e4da1c56ec788f 26-Jul-2013 Geert Uytterhoeven <geert@linux-m68k.org> dm cache: avoid conflicting remove_mapping() in mq policy

On sparc32, which includes <linux/swap.h> from <asm/pgtable_32.h>:

drivers/md/dm-cache-policy-mq.c:962:13: error: conflicting types for 'remove_mapping'
include/linux/swap.h:285:12: note: previous declaration of 'remove_mapping' was here

As mq_remove_mapping() already exists, and the local remove_mapping() is
used only once, inline it manually to avoid the conflict.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair Kergon <agk@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
4e7f506f6429636115e2f58f9f97089acc62524a 20-Mar-2013 Mike Snitzer <snitzer@redhat.com> dm cache: policy change version from string to integer set

Separate dm cache policy version string into 3 unsigned numbers
corresponding to major, minor and patchlevel and store them at the end
of the on-disk metadata so we know which version of the policy generated
the hints in case a future version wants to use them differently.
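
One practical benefit of three integers over a string is that versions compare numerically. The sketch below is an illustration only: the macro and function names are hypothetical, and the parsing helper is not something the commit describes; it is included just to show how a dotted string maps onto the triple.

```c
#include <stdio.h>

#define POLICY_VERSION_SIZE 3	/* major, minor, patchlevel (hypothetical name) */

/* Split "major.minor.patchlevel" into three unsigned numbers.
 * Returns 0 on success, -1 if the string does not have three fields. */
static int parse_policy_version(const char *s, unsigned v[POLICY_VERSION_SIZE])
{
	return sscanf(s, "%u.%u.%u", &v[0], &v[1], &v[2]) == 3 ? 0 : -1;
}

/* Numeric comparison: <0, 0 or >0, most significant field first. */
static int policy_version_cmp(const unsigned a[POLICY_VERSION_SIZE],
			      const unsigned b[POLICY_VERSION_SIZE])
{
	int i;

	for (i = 0; i < POLICY_VERSION_SIZE; i++) {
		if (a[i] != b[i])
			return a[i] < b[i] ? -1 : 1;
	}
	return 0;
}
```

A plain string comparison would order "1.10.0" before "1.2.0"; comparing the integer triples orders them correctly, which matters when a later policy version decides how to interpret hints written by an earlier one.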

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
f283635281132af7bc7b90af3c105b8c0f73b9c7 01-Mar-2013 Joe Thornber <ejt@redhat.com> dm cache: add mq policy

A cache policy that uses a multiqueue ordered by recent hit
count to select which blocks should be promoted and demoted.
This is meant to be a general purpose policy. It prioritises
reads over writes.
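
As a rough illustration of a multiqueue ordered by hit count, entries can be binned into a fixed number of levels, with hotter blocks in higher levels. The logarithmic mapping and the level count of 16 below are assumptions chosen for the sketch; the commit message only states that ordering is by recent hit count.

```c
#define NR_QUEUE_LEVELS 16u	/* ASSUMPTION: number of levels */

/* Integer log2 for v >= 1 (position of the highest set bit). */
static unsigned ilog2_u(unsigned v)
{
	unsigned r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* ASSUMPTION: level grows with log2 of the hit count, capped at the top
 * level. A hit count of 0 is treated like 1 and lands in level 0. */
static unsigned queue_level(unsigned hit_count)
{
	unsigned level = ilog2_u(hit_count ? hit_count : 1);

	return level < NR_QUEUE_LEVELS - 1 ? level : NR_QUEUE_LEVELS - 1;
}
```

With such a scheme, candidates for demotion would be taken from the lowest non-empty level and candidates for promotion judged against the hotter levels, so a block must keep being hit to stay near the top.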

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>