android_kernel_msm-6.1_noth.../include
Kent Overstreet df2cb6daa4 block: Avoid deadlocks with bio allocation by stacking drivers
Previously, if we ever try to allocate more than once from the same bio
set while running under generic_make_request() (i.e. a stacking block
driver), we risk deadlock.

This is because of the code in generic_make_request() that converts
recursion to iteration; any bios we submit won't actually be submitted
(so they can complete and eventually be freed) until after we return -
this means if we allocate a second bio, we're blocking the first one
from ever being freed.

Thus if enough threads call into a stacking block driver at the same
time with bios that need multiple splits, and the bio_set's reserve gets
used up, we deadlock.

This can be worked around in the driver code - we could check if we're
running under generic_make_request(), then mask out __GFP_WAIT when we
go to allocate a bio, and if the allocation fails punt to workqueue and
retry the allocation.

But this is tricky and not a generic solution. This patch solves it for
all users by inverting the previously described technique. We allocate a
rescuer workqueue for each bio_set, and then in the allocation code if
there are bios on current->bio_list we would be blocking, we punt them
to the rescuer workqueue to be submitted.

This guarantees forward progress for bio allocations under
generic_make_request() provided each bio is submitted before allocating
the next, and provided the bios are freed after they complete.

Note that this doesn't do anything for allocation from other mempools.
Instead of allocating per bio data structures from a mempool, code
should use bio_set's front_pad.

Tested it by forcing the rescue codepath to be taken (by disabling the
first GFP_NOWAIT) attempt, and then ran it with bcache (which does a lot
of arbitrary bio splitting) and verified that the rescuer was being
invoked.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Muthukumar Ratty <muthur@gmail.com>
2013-03-23 14:15:26 -07:00
..
acpi Fixes: 2013-03-12 20:25:53 -07:00
asm-generic asm-generic: move cmpxchg*_local defs to cmpxchg.h 2013-03-13 06:11:05 +01:00
clocksource ImgTec Meta architecture changes for v3.9-rc1 2013-03-03 12:06:09 -08:00
crypto
drm drm: Documentation typo fixes 2013-03-08 08:32:23 +10:00
keys
linux block: Avoid deadlocks with bio allocation by stacking drivers 2013-03-23 14:15:26 -07:00
math-emu
media [media] media: ov7670: Add possibility to disable pixclk during hblank 2013-02-08 14:35:06 -02:00
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-03-05 18:42:29 -08:00
pcmcia
ras edac: add support for error type "Info" 2013-02-21 14:16:27 -03:00
rdma IB/core: Add "type 2" memory windows support 2013-02-21 11:51:45 -08:00
rxrpc
scsi SCSI for-linus on 20130301 2013-03-02 11:42:16 -08:00
sound arm-soc: late OMAP changes 2013-02-28 20:00:40 -08:00
target target: Rename spc_get_write_same_sectors -> sbc_get_write_same_sectors 2013-02-23 12:46:14 -08:00
trace Various bug fixes for ext4. The most important is a fix for the new 2013-03-02 19:33:21 -08:00
uapi Merge branch 'akpm' (fixes from Andrew) 2013-03-13 15:21:57 -07:00
video UAPI disintegration 2012-12-20 2013-03-03 14:24:59 -08:00
xen xen: event channel arrays are xen_ulong_t and not unsigned long 2013-02-20 08:45:07 -05:00
Kbuild