android_kernel_msm-6.1_noth.../include
David Rientjes 30467e0b3b mm, hotplug: fix concurrent memory hot-add deadlock
There's a deadlock when concurrently hot-adding memory through the probe
interface and switching a memory block from offline to online.

When hot-adding memory via the probe interface, add_memory() first takes
mem_hotplug_begin() and then device_lock() is later taken when registering
the newly initialized memory block.  This creates a lock dependency of (1)
mem_hotplug.lock (2) dev->mutex.

When switching a memory block from offline to online, dev->mutex is first
grabbed in device_online() when the write(2) transitions an existing
memory block from offline to online, and then online_pages() will take
mem_hotplug_begin().

This creates a lock inversion between mem_hotplug.lock and dev->mutex.
Vitaly reports that this deadlock can happen when kworker handling a probe
event races with systemd-udevd switching a memory block's state.

This patch requires the state transition to take mem_hotplug_begin()
before dev->mutex.  Hot-adding memory via the probe interface creates a
memory block while holding mem_hotplug_begin(), there is no way to take
dev->mutex first in this case.

online_pages() and offline_pages() are only called when transitioning
memory block state.  We now require that mem_hotplug_begin() is taken
before calling them -- this requires exporting the mem_hotplug_begin() and
mem_hotplug_done() to generic code.  In all hot-add and hot-remove cases,
mem_hotplug_begin() is done prior to device_online().  This is all that is
needed to avoid the deadlock.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-14 16:49:00 -07:00
..
acpi Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2015-02-19 11:28:36 -08:00
asm-generic OK, this has the big virtio 1.0 implementation, as specified by OASIS. 2015-02-18 09:24:01 -08:00
clocksource
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2015-02-14 09:47:01 -08:00
drm drm/ttm: device address space != CPU address space 2015-03-05 09:04:39 +10:00
dt-bindings USB patches for 4.1-rc1 2015-04-13 17:07:21 -07:00
keys
kvm KVM/ARM changes for v4.1: 2015-04-07 18:09:20 +02:00
linux mm, hotplug: fix concurrent memory hot-add deadlock 2015-04-14 16:49:00 -07:00
math-emu
media [media] media: atmel-isi: increase the burst length to improve the performance 2015-03-02 13:27:11 -03:00
memory
misc
net ipv6: protect skb->sk accesses from recursive dereference inside the stack 2015-04-06 16:12:49 -04:00
pcmcia
ras
rdma
rxrpc
scsi libata-eh: Set 'information' field for autosense 2015-03-27 11:59:22 -04:00
soc pm: at91: Workaround DDRSDRC self-refresh bug with LPDDR1 memories. 2015-03-03 19:43:59 +01:00
sound
target target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0 2015-03-19 23:26:46 -07:00
trace Merge branch 'for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2015-04-13 16:42:16 -07:00
uapi Staging driver patches for 4.1-rc1 2015-04-13 17:37:33 -07:00
video OMAPDSS: fix regression with display sysfs files 2015-02-26 10:23:15 +02:00
xen xen: Remove trailing semicolon from xenbus_register_frontend() definition 2015-03-02 10:38:59 +00:00
Kbuild