android_kernel_msm-6.1_noth.../include
Frederic Weisbecker 2652d199dd workqueue: Provide one lock class key per work_on_cpu() callsite
[ Upstream commit 265f3ed077036f053981f5eea0b5b43e7c5b39ff ]

All callers of work_on_cpu() share the same lock class key for all the
functions queued. As a result the workqueue related locking scenario for
a function A may be spuriously accounted as an inversion against the
locking scenario of function B such as in the following model:

	long A(void *arg)
	{
		mutex_lock(&mutex);
		mutex_unlock(&mutex);
	}

	long B(void *arg)
	{
	}

	void launchA(void)
	{
		work_on_cpu(0, A, NULL);
	}

	void launchB(void)
	{
		mutex_lock(&mutex);
		work_on_cpu(1, B, NULL);
		mutex_unlock(&mutex);
	}

launchA and launchB running concurrently have no chance to deadlock.
However the above can be reported by lockdep as a possible locking
inversion because the works containing A() and B() are treated as
belonging to the same locking class.

The following shows an existing example of such a spurious lockdep splat:

	 ======================================================
	 WARNING: possible circular locking dependency detected
	 6.6.0-rc1-00065-g934ebd6e5359 #35409 Not tainted
	 ------------------------------------------------------
	 kworker/0:1/9 is trying to acquire lock:
	 ffffffff9bc72f30 (cpu_hotplug_lock){++++}-{0:0}, at: _cpu_down+0x57/0x2b0

	 but task is already holding lock:
	 ffff9e3bc0057e60 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_scheduled_works+0x216/0x500

	 which lock already depends on the new lock.

	 the existing dependency chain (in reverse order) is:

	 -> #2 ((work_completion)(&wfc.work)){+.+.}-{0:0}:
			__flush_work+0x83/0x4e0
			work_on_cpu+0x97/0xc0
			rcu_nocb_cpu_offload+0x62/0xb0
			rcu_nocb_toggle+0xd0/0x1d0
			kthread+0xe6/0x120
			ret_from_fork+0x2f/0x40
			ret_from_fork_asm+0x1b/0x30

	 -> #1 (rcu_state.barrier_mutex){+.+.}-{3:3}:
			__mutex_lock+0x81/0xc80
			rcu_nocb_cpu_deoffload+0x38/0xb0
			rcu_nocb_toggle+0x144/0x1d0
			kthread+0xe6/0x120
			ret_from_fork+0x2f/0x40
			ret_from_fork_asm+0x1b/0x30

	 -> #0 (cpu_hotplug_lock){++++}-{0:0}:
			__lock_acquire+0x1538/0x2500
			lock_acquire+0xbf/0x2a0
			percpu_down_write+0x31/0x200
			_cpu_down+0x57/0x2b0
			__cpu_down_maps_locked+0x10/0x20
			work_for_cpu_fn+0x15/0x20
			process_scheduled_works+0x2a7/0x500
			worker_thread+0x173/0x330
			kthread+0xe6/0x120
			ret_from_fork+0x2f/0x40
			ret_from_fork_asm+0x1b/0x30

	 other info that might help us debug this:

	 Chain exists of:
	   cpu_hotplug_lock --> rcu_state.barrier_mutex --> (work_completion)(&wfc.work)

	  Possible unsafe locking scenario:

			CPU0                    CPU1
			----                    ----
	   lock((work_completion)(&wfc.work));
									lock(rcu_state.barrier_mutex);
									lock((work_completion)(&wfc.work));
	   lock(cpu_hotplug_lock);

	  *** DEADLOCK ***

	 2 locks held by kworker/0:1/9:
	  #0: ffff900481068b38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x212/0x500
	  #1: ffff9e3bc0057e60 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_scheduled_works+0x216/0x500

	 stack backtrace:
	 CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.0-rc1-00065-g934ebd6e5359 #35409
	 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
	 Workqueue: events work_for_cpu_fn
	 Call Trace:
	 rcu-torture: rcu_torture_read_exit: Start of episode
	  <TASK>
	  dump_stack_lvl+0x4a/0x80
	  check_noncircular+0x132/0x150
	  __lock_acquire+0x1538/0x2500
	  lock_acquire+0xbf/0x2a0
	  ? _cpu_down+0x57/0x2b0
	  percpu_down_write+0x31/0x200
	  ? _cpu_down+0x57/0x2b0
	  _cpu_down+0x57/0x2b0
	  __cpu_down_maps_locked+0x10/0x20
	  work_for_cpu_fn+0x15/0x20
	  process_scheduled_works+0x2a7/0x500
	  worker_thread+0x173/0x330
	  ? __pfx_worker_thread+0x10/0x10
	  kthread+0xe6/0x120
	  ? __pfx_kthread+0x10/0x10
	  ret_from_fork+0x2f/0x40
	  ? __pfx_kthread+0x10/0x10
	  ret_from_fork_asm+0x1b/0x30
	  </TASK

Fix this with providing one lock class key per work_on_cpu() caller.

Reported-and-tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28 17:06:55 +00:00
..
acpi ACPI: sleep: Avoid breaking S3 wakeup due to might_sleep() 2023-06-28 11:12:22 +02:00
asm-generic word-at-a-time: use the same return type for has_zero regardless of endianness 2023-08-11 12:08:11 +02:00
clocksource
crypto crypto: api - Use work queue in crypto_destroy_instance 2023-09-13 09:42:32 +02:00
drm gpu/drm: Eliminate DRM_SCHED_PRIORITY_UNSET 2023-11-08 14:11:00 +01:00
dt-bindings dt-bindings: clock: qcom,gcc-sc8280xp: Add missing GDSCs 2023-09-13 09:42:45 +02:00
keys
kunit kunit: add macro to allow conditionally exposing static symbols to tests 2023-11-20 11:52:08 +01:00
kvm KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption 2023-08-23 17:52:28 +02:00
linux workqueue: Provide one lock class key per work_on_cpu() callsite 2023-11-28 17:06:55 +00:00
math-emu
media media: cec: core: add adap_unconfigured() callback 2023-09-13 09:42:54 +02:00
memory memory: renesas-rpc-if: Split-off private data from struct rpcif 2023-03-11 13:55:17 +01:00
misc
net wifi: cfg80211: fix kernel-doc for wiphy_delayed_work_flush() 2023-11-20 11:52:18 +01:00
pcmcia
ras
rdma RDMA/cma: Always set static rate to 0 for RoCE 2023-06-21 16:00:59 +02:00
rv
scsi scsi: sd: Introduce manage_shutdown device flag 2023-11-02 09:35:29 +01:00
soc net: mscc: ocelot: don't keep PTP configuration of all ports in single structure 2023-07-19 16:22:01 +02:00
sound ASoC: Intel: avs: Account for UID of ACPI device 2023-06-21 16:00:53 +02:00
target scsi: target: Fix multiple LUN_RESET handling 2023-05-11 23:03:19 +09:00
trace neighbor: tracing: Move pin6 inside CONFIG_IPV6=y section 2023-10-25 12:03:07 +02:00
uapi x86/sev: Change snp_guest_issue_request()'s fw_err argument 2023-11-20 11:52:13 +01:00
ufs scsi: ufs: exynos: Fix DMA alignment for PAGE_SIZE != 4096 2023-03-10 09:33:15 +01:00
vdso
video
xen ACPI: processor: Fix evaluating _PDC method when running as Xen dom0 2023-05-11 23:03:11 +09:00