android_kernel_msm-6.1_noth.../kernel/trace
Chen Lin 128c06a34c ring-buffer: Do not swap cpu_buffer during resize process
[ Upstream commit 8a96c0288d0737ad77882024974c075345c72011 ]

When ring_buffer_swap_cpu was called during resize process,
the cpu buffer was swapped in the middle, resulting in incorrect state.
Continuing to run in the wrong state will result in oops.

This issue can be easily reproduced using the following two scripts:
/tmp # cat test1.sh
//#! /bin/sh
for i in `seq 0 100000`
do
         echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
         sleep 0.5
         echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
         sleep 0.5
done
/tmp # cat test2.sh
//#! /bin/sh
for i in `seq 0 100000`
do
        echo irqsoff > /sys/kernel/debug/tracing/current_tracer
        sleep 1
        echo nop > /sys/kernel/debug/tracing/current_tracer
        sleep 1
done
/tmp # ./test1.sh &
/tmp # ./test2.sh &

A typical oops log is as follows, sometimes with other different oops logs.

[  231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8
[  231.713375] Modules linked in:
[  231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
[  231.716750] Hardware name: linux,dummy-virt (DT)
[  231.718152] Workqueue: events update_pages_handler
[  231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  231.721171] pc : rb_update_pages+0x378/0x3f8
[  231.722212] lr : rb_update_pages+0x25c/0x3f8
[  231.723248] sp : ffff800082b9bd50
[  231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[  231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0
[  231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a
[  231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000
[  231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510
[  231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
[  231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558
[  231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001
[  231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000
[  231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208
[  231.744196] Call trace:
[  231.744892]  rb_update_pages+0x378/0x3f8
[  231.745893]  update_pages_handler+0x1c/0x38
[  231.746893]  process_one_work+0x1f0/0x468
[  231.747852]  worker_thread+0x54/0x410
[  231.748737]  kthread+0x124/0x138
[  231.749549]  ret_from_fork+0x10/0x20
[  231.750434] ---[ end trace 0000000000000000 ]---
[  233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  233.721696] Mem abort info:
[  233.721935]   ESR = 0x0000000096000004
[  233.722283]   EC = 0x25: DABT (current EL), IL = 32 bits
[  233.722596]   SET = 0, FnV = 0
[  233.722805]   EA = 0, S1PTW = 0
[  233.723026]   FSC = 0x04: level 0 translation fault
[  233.723458] Data abort info:
[  233.723734]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  233.724176]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  233.724589]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000
[  233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[  233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[  233.726720] Modules linked in:
[  233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
[  233.727777] Hardware name: linux,dummy-virt (DT)
[  233.728225] Workqueue: events update_pages_handler
[  233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  233.729054] pc : rb_update_pages+0x1a8/0x3f8
[  233.729334] lr : rb_update_pages+0x154/0x3f8
[  233.729592] sp : ffff800082b9bd50
[  233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[  233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418
[  233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003
[  233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58
[  233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001
[  233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000
[  233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c
[  233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0
[  233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000
[  233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000
[  233.734418] Call trace:
[  233.734593]  rb_update_pages+0x1a8/0x3f8
[  233.734853]  update_pages_handler+0x1c/0x38
[  233.735148]  process_one_work+0x1f0/0x468
[  233.735525]  worker_thread+0x54/0x410
[  233.735852]  kthread+0x124/0x138
[  233.736064]  ret_from_fork+0x10/0x20
[  233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060)
[  233.736959] ---[ end trace 0000000000000000 ]---

After analysis, the seq of the error is as follows [1-5]:

int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
			int cpu_id)
{
	for_each_buffer_cpu(buffer, cpu) {
		cpu_buffer = buffer->buffers[cpu];
		//1. get cpu_buffer, aka cpu_buffer(A)
		...
		...
		schedule_work_on(cpu,
		 &cpu_buffer->update_pages_work);
		//2. 'update_pages_work' is queue on 'cpu', cpu_buffer(A) is passed to
		// update_pages_handler, do the update process, set 'update_done' in
		// complete(&cpu_buffer->update_done) and to wakeup resize process.
	//---->
		//3. Just at this moment, ring_buffer_swap_cpu is triggered,
		//cpu_buffer(A) be swaped to cpu_buffer(B), the max_buffer.
		//ring_buffer_swap_cpu is called as the 'Call trace' below.

		Call trace:
		 dump_backtrace+0x0/0x2f8
		 show_stack+0x18/0x28
		 dump_stack+0x12c/0x188
		 ring_buffer_swap_cpu+0x2f8/0x328
		 update_max_tr_single+0x180/0x210
		 check_critical_timing+0x2b4/0x2c8
		 tracer_hardirqs_on+0x1c0/0x200
		 trace_hardirqs_on+0xec/0x378
		 el0_svc_common+0x64/0x260
		 do_el0_svc+0x90/0xf8
		 el0_svc+0x20/0x30
		 el0_sync_handler+0xb0/0xb8
		 el0_sync+0x180/0x1c0
	//<----

	/* wait for all the updates to complete */
	for_each_buffer_cpu(buffer, cpu) {
		cpu_buffer = buffer->buffers[cpu];
		//4. get cpu_buffer, cpu_buffer(B) is used in the following process,
		//the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong.
		//for example, cpu_buffer(A)->update_done will leave be set 1, and will
		//not 'wait_for_completion' at the next resize round.
		  if (!cpu_buffer->nr_pages_to_update)
			continue;

		if (cpu_online(cpu))
			wait_for_completion(&cpu_buffer->update_done);
		cpu_buffer->nr_pages_to_update = 0;
	}
	...
}
	//5. the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong,
	//Continuing to run in the wrong state, then oops occurs.

Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn

Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-23 17:52:27 +02:00
..
rv rv/monitor: Add __init/__exit annotations to module init/exit funcs 2022-09-26 18:10:51 -04:00
blktrace.c trace/blktrace: fix memory leak with using debugfs_lookup() 2023-03-10 09:33:46 +01:00
bpf_trace.c bpf: Disable preemption in bpf_event_output 2023-08-11 12:08:21 +02:00
bpf_trace.h
error_report-traces.c
fgraph.c arm64 fixes for 5.19-rc1: 2022-06-03 14:05:34 -07:00
fprobe.c fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() 2023-07-23 13:49:46 +02:00
ftrace.c ftrace: Fix possible warning on checking all pages used in ftrace_process_locs() 2023-07-23 13:49:44 +02:00
ftrace_internal.h
Kconfig tracing: Fix complicated dependency of CONFIG_TRACER_MAX_TRACE 2023-01-07 11:11:48 +01:00
kprobe_event_gen_test.c tracing: Fix wrong return in kprobe_event_gen_test.c 2023-04-06 12:10:41 +02:00
Makefile rv: Add Runtime Verification (RV) interface 2022-07-30 14:01:28 -04:00
pid_list.c tracing: Cleanup double word in comment 2022-04-26 17:58:50 -04:00
pid_list.h
power-traces.c
preemptirq_delay_test.c
rethook.c fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() 2023-07-23 13:49:46 +02:00
ring_buffer.c ring-buffer: Do not swap cpu_buffer during resize process 2023-08-23 17:52:27 +02:00
ring_buffer_benchmark.c
rpm-traces.c
synth_event_gen_test.c tracing: Fix memory leak in test_gen_synth_cmd() and test_empty_synth_event() 2022-11-17 17:51:38 -05:00
trace.c ring-buffer: Do not swap cpu_buffer during resize process 2023-08-23 17:52:27 +02:00
trace.h Revert "tracing: Add "(fault)" name injection to kernel probes" 2023-08-03 10:23:55 +02:00
trace_benchmark.c tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_benchmark.h tracing: Add numeric delta time to the trace event benchmark 2022-09-26 13:01:09 -04:00
trace_boot.c tracing: Initialize integer variable to prevent garbage return value 2022-05-26 21:13:00 -04:00
trace_branch.c
trace_clock.c
trace_dynevent.c tracing: Free buffers when a used dynamic event is removed 2022-11-23 19:07:12 -05:00
trace_dynevent.h
trace_entries.h
trace_eprobe.c kernel/trace: Fix cleanup logic of enable_trace_eprobe 2023-07-23 13:49:24 +02:00
trace_event_perf.c tracing/perf: Fix double put of trace event when init fails 2022-08-21 15:56:07 -04:00
trace_events.c tracing: Fix warning in trace_buffered_event_disable() 2023-08-03 10:24:07 +02:00
trace_events_filter.c tracing/filter: Call filter predicate functions directly via a switch statement 2022-09-26 13:01:10 -04:00
trace_events_filter_test.h
trace_events_hist.c tracing: Allow synthetic events to pass around stacktraces 2023-08-03 10:23:55 +02:00
trace_events_inject.c tracing: Support __rel_loc relative dynamic data location attribute 2021-12-06 15:37:21 -05:00
trace_events_synth.c tracing: Fix trace_event_raw_event_synth() if else statement 2023-08-03 10:24:17 +02:00
trace_events_trigger.c tracing: Fix to check event_mutex is held while accessing trigger list 2022-09-06 22:26:00 -04:00
trace_events_user.c tracing/user_events: Fix struct arg size match check 2023-07-23 13:49:47 +02:00
trace_export.c tracing: Fix TASK_COMM_LEN in trace event format file 2023-02-14 19:11:54 +01:00
trace_functions.c
trace_functions_graph.c
trace_hwlat.c tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr 2023-03-30 12:48:59 +02:00
trace_irqsoff.c
trace_kdb.c
trace_kprobe.c tracing: Move duplicate code of trace_kprobe/eprobe.c into header 2022-10-12 13:50:00 -04:00
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_mmiotrace.c
trace_nop.c
trace_osnoise.c tracing/timerlat: Always wakeup the timerlat thread 2023-06-09 10:34:24 +02:00
trace_output.c tracing: Make sure trace_printk() can output as soon as it can be used 2023-02-01 08:34:37 +01:00
trace_output.h
trace_preemptirq.c tracing: hold caller_addr to hardirq_{enable,disable}_ip 2022-09-06 22:26:00 -04:00
trace_printk.c
trace_probe.c Revert "tracing: Add "(fault)" name injection to kernel probes" 2023-08-03 10:23:55 +02:00
trace_probe.h tracing/probes: Add symstr type for dynamic events 2023-08-03 10:23:54 +02:00
trace_probe_kernel.h tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails 2023-08-03 10:23:55 +02:00
trace_probe_tmpl.h tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails 2023-08-03 10:23:55 +02:00
trace_recursion_record.c tracing: Use trace_create_file() to simplify creation of tracefs entries 2022-05-26 21:12:52 -04:00
trace_sched_switch.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_sched_wakeup.c sched/tracing: Append prev_state to tp args instead 2022-05-12 00:37:11 +02:00
trace_selftest.c tracing: Reset the function filter after completing trampoline/graph selftest 2022-05-25 16:57:37 -04:00
trace_selftest_dynamic.c
trace_seq.c
trace_stack.c
trace_stat.c
trace_stat.h
trace_synth.h tracing: Allow synthetic events to pass around stacktraces 2023-08-03 10:23:55 +02:00
trace_syscalls.c tracing: Remove unused __bad_type_size() method 2022-11-17 20:21:06 -05:00
trace_uprobe.c tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails 2023-08-03 10:23:55 +02:00
tracing_map.c tracing: Remove unused variable 'dups' 2022-10-03 12:20:31 -04:00
tracing_map.h