Kernel Source and devicetree for NOTHING Phone(3a) and Phone(3a)Pro
Find a file
Frederic Weisbecker c817f5c016 rcu: Defer RCU kthreads wakeup when CPU is dying
[ Upstream commit e787644caf7628ad3269c1fbd321c3255cf51710 ]

When the CPU goes idle for the last time during the CPU down hotplug
process, RCU reports a final quiescent state for the current CPU. If
this quiescent state propagates up to the top, some tasks may then be
woken up to complete the grace period: the main grace period kthread
and/or the expedited main workqueue (or kworker).

If those kthreads have a SCHED_FIFO policy, the wake up can indirectly
arm the RT bandwith timer to the local offline CPU. Since this happens
after hrtimers have been migrated at CPUHP_AP_HRTIMERS_DYING stage, the
timer gets ignored. Therefore if the RCU kthreads are waiting for RT
bandwidth to be available, they may never be actually scheduled.

This triggers TREE03 rcutorture hangs:

	 rcu: INFO: rcu_preempt self-detected stall on CPU
	 rcu:     4-...!: (1 GPs behind) idle=9874/1/0x4000000000000000 softirq=0/0 fqs=20 rcuc=21071 jiffies(starved)
	 rcu:     (t=21035 jiffies g=938281 q=40787 ncpus=6)
	 rcu: rcu_preempt kthread starved for 20964 jiffies! g938281 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
	 rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
	 rcu: RCU grace-period kthread stack dump:
	 task:rcu_preempt     state:R  running task     stack:14896 pid:14    tgid:14    ppid:2      flags:0x00004000
	 Call Trace:
	  <TASK>
	  __schedule+0x2eb/0xa80
	  schedule+0x1f/0x90
	  schedule_timeout+0x163/0x270
	  ? __pfx_process_timeout+0x10/0x10
	  rcu_gp_fqs_loop+0x37c/0x5b0
	  ? __pfx_rcu_gp_kthread+0x10/0x10
	  rcu_gp_kthread+0x17c/0x200
	  kthread+0xde/0x110
	  ? __pfx_kthread+0x10/0x10
	  ret_from_fork+0x2b/0x40
	  ? __pfx_kthread+0x10/0x10
	  ret_from_fork_asm+0x1b/0x30
	  </TASK>

The situation can't be solved with just unpinning the timer. The hrtimer
infrastructure and the nohz heuristics involved in finding the best
remote target for an unpinned timer would then also need to handle
enqueues from an offline CPU in the most horrendous way.

So fix this on the RCU side instead and defer the wake up to an online
CPU if it's too late for the local one.

Reported-by: Paul E. McKenney <paulmck@kernel.org>
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-31 16:17:05 -08:00
arch LoongArch/smp: Call rcutree_report_cpu_starting() earlier 2024-01-31 16:17:03 -08:00
block block: Remove special-casing of compound pages 2024-01-25 15:27:52 -08:00
certs certs: Fix build error when PKCS#11 URI contains semicolon 2023-02-09 11:28:11 +01:00
crypto crypto: api - Disallow identical driver names 2024-01-31 16:16:58 -08:00
Documentation rename(): fix the locking of subdirectories 2024-01-31 16:17:02 -08:00
drivers net/mlx5e: fix a potential double-free in fs_any_create_groups 2024-01-31 16:17:05 -08:00
fs afs: Hide silly-rename files from userspace 2024-01-31 16:17:04 -08:00
include udp: fix busy polling 2024-01-31 16:17:04 -08:00
init rootfs: Fix support for rootfstype= when root= is given 2024-01-25 15:27:43 -08:00
io_uring io_uring/rw: ensure io->bytes_done is always initialized 2024-01-25 15:27:41 -08:00
ipc ipc: fix memory leak in init_mqueue_fs() 2022-12-31 13:32:01 +01:00
kernel rcu: Defer RCU kthreads wakeup when CPU is dying 2024-01-31 16:17:05 -08:00
lib kunit: debugfs: Fix unchecked dereference in debugfs_print_results() 2024-01-25 15:27:21 -08:00
LICENSES LICENSES/LGPL-2.1: Add LGPL-2.1-or-later as valid identifiers 2021-12-16 14:33:10 +01:00
mm mm: page_alloc: unreserve highatomic page blocks before oom 2024-01-31 16:17:03 -08:00
net ipv6: init the accept_queue's spinlocks in inet6_create 2024-01-31 16:17:05 -08:00
rust rust: allocator: Prevent mis-aligned allocation 2023-08-11 12:08:18 +02:00
samples fprobe: Pass entry_data to handlers 2023-10-25 12:03:12 +02:00
scripts scripts/get_abi: fix source path leak 2024-01-31 16:17:01 -08:00
security lsm: new security_file_ioctl_compat() hook 2024-01-31 16:17:00 -08:00
sound ASoC: mediatek: sof-common: Add NULL check for normal_link string 2024-01-25 15:27:48 -08:00
tools selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes 2024-01-25 15:27:51 -08:00
usr usr/gen_init_cpio.c: remove unnecessary -1 values from int file 2022-10-03 14:21:44 -07:00
virt kvm/vfio: ensure kvg instance stays around in kvm_vfio_group_add() 2023-09-13 09:42:46 +02:00
.clang-format inet: ping: use hlist_nulls rcu iterator during lookup 2022-12-01 12:42:46 +01:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore Kbuild: add Rust support 2022-09-28 09:02:20 +02:00
.mailmap 9 hotfixes. 6 for MM, 3 for other areas. Four of these patches address 2022-12-10 17:10:52 -08:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Remove Michal Marek from Kbuild maintainers 2022-11-16 14:53:00 +09:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS genirq/affinity: Move group_cpus_evenly() into lib/ 2024-01-10 17:10:33 +01:00
Makefile Linux 6.1.75 2024-01-25 15:27:52 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.