android_kernel_msm-6.1_noth.../kernel
Andrii Nakryiko 56675ddcb0 bpf: allow precision tracking for programs with subprogs
[ Upstream commit be2ef8161572ec1973124ebc50f56dafc2925e07 ]

Stop forcing precise=true for SCALAR registers when BPF program has any
subprograms. Current restriction means that any BPF program, as soon as
it uses subprograms, will end up not getting any of the precision
tracking benefits in reduction of number of verified states.

This patch keeps the fallback mark_all_scalars_precise() behavior if
precise marking has to cross function frames. E.g., if subprogram
requires R1 (first input arg) to be marked precise, ideally we'd need to
backtrack to the parent function and keep marking R1 and its
dependencies as precise. But right now we give up and force all the
SCALARs in any of the current and parent states to be forced to
precise=true. We can lift that restriction in the future.

But this patch fixes two issues identified when trying to enable
precision tracking for subprogs.

First, prevent "escaping" from top-most state in a global subprog. While
with entry-level BPF program we never end up requesting precision for
R1-R5 registers, because R2-R5 are not initialized (and so not readable
in correct BPF program), and R1 is PTR_TO_CTX, not SCALAR, and so is
implicitly precise. With global subprogs, though, it's different, as
global subprog a) can have up to 5 SCALAR input arguments, which might
get marked as precise=true and b) it is validated in isolation from its
main entry BPF program. b) means that we can end up exhausting parent
state chain and still not mark all registers in reg_mask as precise,
which would lead to verifier bug warning.

To handle that, we need to consider two cases. First, if the very first
state is not immediately "checkpointed" (i.e., stored in state lookup
hashtable), it will get correct first_insn_idx and last_insn_idx
instruction set during state checkpointing. As such, this case is
already handled and __mark_chain_precision() already handles that by
just doing nothing when we reach to the very first parent state.
st->parent will be NULL and we'll just stop. Perhaps some extra check
for reg_mask and stack_mask is due here, but this patch doesn't address
that issue.

More problematic second case is when global function's initial state is
immediately checkpointed before we manage to process the very first
instruction. This is happening because when there is a call to global
subprog from the main program the very first subprog's instruction is
marked as pruning point, so before we manage to process first
instruction we have to check and checkpoint state. This patch adds
a special handling for such "empty" state, which is identified by having
st->last_insn_idx set to -1. In such case, we check that we are indeed
validating global subprog, and with some sanity checking we mark input
args as precise if requested.

Note that we also initialize state->first_insn_idx with correct start
insn_idx offset. For main program zero is correct value, but for any
subprog it's quite confusing to not have first_insn_idx set. This
doesn't have any functional impact, but helps with debugging and state
printing. We also explicitly initialize state->last_insns_idx instead of
relying on is_state_visited() to do this with env->prev_insns_idx, which
will be -1 on the very first instruction. This concludes necessary
changes to handle specifically global subprog's precision tracking.

Second identified problem was missed handling of BPF helper functions
that call into subprogs (e.g., bpf_loop and few others). From precision
tracking and backtracking logic's standpoint those are effectively calls
into subprogs and should be called as BPF_PSEUDO_CALL calls.

This patch takes the least intrusive way and just checks against a short
list of current BPF helpers that do call subprogs, encapsulated in
is_callback_calling_function() function. But to prevent accidentally
forgetting to add new BPF helpers to this "list", we also do a sanity
check in __check_func_call, which has to be called for each such special
BPF helper, to validate that BPF helper is indeed recognized as
callback-calling one. This should catch any missed checks in the future.
Adding some special flags to be added in function proto definitions
seemed like an overkill in this case.

With the above changes, it's possible to remove forceful setting of
reg->precise to true in __mark_reg_unknown, which turns on precision
tracking both inside subprogs and entry progs that have subprogs. No
warnings or errors were detected across all the selftests, but also when
validating with veristat against internal Meta BPF objects and Cilium
objects. Further, in some BPF programs there are noticeable reduction in
number of states and instructions validated due to more effective
precision tracking, especially benefiting syncookie test.

$ ./veristat -C -e file,prog,insns,states ~/baseline-results.csv ~/subprog-precise-results.csv  | grep -v '+0'
File                                      Program                     Total insns (A)  Total insns (B)  Total insns (DIFF)  Total states (A)  Total states (B)  Total states (DIFF)
----------------------------------------  --------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------
pyperf600_bpf_loop.bpf.linked1.o          on_event                               3966             3678       -288 (-7.26%)               306               276         -30 (-9.80%)
pyperf_global.bpf.linked1.o               on_event                               7563             7530        -33 (-0.44%)               520               517          -3 (-0.58%)
pyperf_subprogs.bpf.linked1.o             on_event                              36358            36934       +576 (+1.58%)              2499              2531         +32 (+1.28%)
setget_sockopt.bpf.linked1.o              skops_sockopt                          3965             4038        +73 (+1.84%)               343               347          +4 (+1.17%)
test_cls_redirect_subprogs.bpf.linked1.o  cls_redirect                          64965            64901        -64 (-0.10%)              4619              4612          -7 (-0.15%)
test_misc_tcp_hdr_options.bpf.linked1.o   misc_estab                             1491             1307      -184 (-12.34%)               110               100         -10 (-9.09%)
test_pkt_access.bpf.linked1.o             test_pkt_access                         354              349         -5 (-1.41%)                25                24          -1 (-4.00%)
test_sock_fields.bpf.linked1.o            egress_read_sock_fields                 435              375       -60 (-13.79%)                22                20          -2 (-9.09%)
test_sysctl_loop2.bpf.linked1.o           sysctl_tcp_mem                         1508             1501         -7 (-0.46%)                29                28          -1 (-3.45%)
test_tc_dtime.bpf.linked1.o               egress_fwdns_prio100                    468              435        -33 (-7.05%)                45                41          -4 (-8.89%)
test_tc_dtime.bpf.linked1.o               ingress_fwdns_prio100                   398              408        +10 (+2.51%)                42                39          -3 (-7.14%)
test_tc_dtime.bpf.linked1.o               ingress_fwdns_prio101                  1096              842      -254 (-23.18%)                97                73        -24 (-24.74%)
test_tcp_hdr_options.bpf.linked1.o        estab                                  2758             2408      -350 (-12.69%)               208               181        -27 (-12.98%)
test_urandom_usdt.bpf.linked1.o           urand_read_with_sema                    466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_urandom_usdt.bpf.linked1.o           urand_read_without_sema                 466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_urandom_usdt.bpf.linked1.o           urandlib_read_with_sema                 466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_urandom_usdt.bpf.linked1.o           urandlib_read_without_sema              466              448        -18 (-3.86%)                31                28          -3 (-9.68%)
test_xdp_noinline.bpf.linked1.o           balancer_ingress_v6                    4302             4294         -8 (-0.19%)               257               256          -1 (-0.39%)
xdp_synproxy_kern.bpf.linked1.o           syncookie_tc                         583722           405757   -177965 (-30.49%)             35846             25735     -10111 (-28.21%)
xdp_synproxy_kern.bpf.linked1.o           syncookie_xdp                        609123           479055   -130068 (-21.35%)             35452             29145      -6307 (-17.79%)
----------------------------------------  --------------------------  ---------------  ---------------  ------------------  ----------------  ----------------  -------------------

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221104163649.121784-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-27 08:50:50 +02:00
..
bpf bpf: allow precision tracking for programs with subprogs 2023-07-27 08:50:50 +02:00
cgroup sched/psi: use kernfs polling functions for PSI trigger polling 2023-07-27 08:50:38 +02:00
configs Kbuild: add Rust support 2022-09-28 09:02:20 +02:00
debug mm: remove vmacache 2022-09-26 19:46:18 -07:00
dma swiotlb: mark swiotlb_memblock_alloc() as __init 2023-07-23 13:49:50 +02:00
entry entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up 2023-03-30 12:49:13 +02:00
events perf/core: Fix hardlockup failure caused by perf throttle 2023-05-11 23:03:31 +09:00
futex futex: Fix futex_waitv() hrtimer debug object leak on kcalloc error 2023-01-04 11:28:58 +01:00
gcov gcov: add support for checksum field 2022-12-31 13:33:11 +01:00
irq x86/pci/xen: populate MSI sysfs entries 2023-05-30 14:03:22 +01:00
kcsan kcsan: Don't expect 64 bits atomic builtins from 32 bits architectures 2023-07-19 16:21:37 +02:00
livepatch Livepatching changes for 6.1 2022-10-10 11:36:19 -07:00
locking locking/rwsem: Add __always_inline annotation to __down_read_common() and inlined callers 2023-05-17 11:53:57 +02:00
module module: Don't wait for GOING modules 2023-02-01 08:34:37 +01:00
power PM: QoS: Restore support for default value on frequency QoS 2023-07-23 13:49:45 +02:00
printk kernel/printk/index.c: fix memory leak with using debugfs_lookup() 2023-03-11 13:55:32 +01:00
rcu rcu: Mark additional concurrent load from ->cpu_no_qs.b.exp 2023-07-27 08:50:33 +02:00
sched sched/psi: use kernfs polling functions for PSI trigger polling 2023-07-27 08:50:38 +02:00
time tick/rcu: Fix bogus ratelimit condition 2023-07-19 16:20:59 +02:00
trace tracing/histograms: Return an error if we fail to add histogram to hist_vars list 2023-07-27 08:50:49 +02:00
.gitignore
acct.c acct: fix potential integer overflow in encode_comp_t() 2022-12-31 13:32:58 +01:00
async.c
audit.c audit: use time_after to compare time 2022-08-29 19:47:03 -04:00
audit.h audit: remove selinux_audit_rule_update() declaration 2022-09-07 11:30:15 -04:00
audit_fsnotify.c audit: fix potential double free on error path from fsnotify_add_inode_mark 2022-08-22 18:50:06 -04:00
audit_tree.c
audit_watch.c audit_init_parent(): constify path 2022-09-01 17:39:30 -04:00
auditfilter.c
auditsc.c audit/stable-6.1 PR 20221003 2022-10-04 11:05:43 -07:00
backtracetest.c
bounds.c mm: multi-gen LRU: minimal implementation 2022-09-26 19:46:09 -07:00
capability.c
cfi.c cfi: Switch to -fsanitize=kcfi 2022-09-26 10:13:13 -07:00
compat.c sched_getaffinity: don't assume 'cpumask_size()' is fully initialized 2023-04-06 12:10:40 +02:00
configs.c
context_tracking.c context_tracking: Fix noinstr vs KASAN 2023-03-10 09:33:45 +01:00
cpu.c cpu/hotplug: Do not bail-out in DYING/STARTING sections 2022-12-31 13:31:59 +01:00
cpu_pm.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
crash_core.c vmcoreinfo: add kallsyms_num_syms symbol 2022-08-28 14:02:44 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: support re-entrance detection of thrashing accounting 2022-09-26 19:46:07 -07:00
dma.c
exec_domain.c
exit.c exit: Detect and fix irq disabled state in oops 2023-03-10 09:33:45 +01:00
extable.c context_tracking: Take NMI eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
fail_function.c kernel/fail_function: fix memory leak with using debugfs_lookup() 2023-03-11 13:55:39 +01:00
fork.c bpf: Fix UAF in task local storage 2023-06-14 11:15:16 +02:00
freezer.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
gen_kheaders.sh kbuild: build init/built-in.a just once 2022-09-29 04:40:15 +09:00
groups.c security: Add LSM hook to setgroups() syscall 2022-07-15 18:21:49 +00:00
hung_task.c sched: Fix more TASK_state comparisons 2022-09-30 16:50:39 +02:00
iomem.c
irq_work.c
jump_label.c jump_label: make initial NOP patching the special case 2022-06-24 09:48:55 +02:00
kallsyms.c kallsyms: strip LTO-only suffixes from promoted global functions 2023-07-27 08:50:39 +02:00
kallsyms_internal.h kallsyms: Improve the performance of kallsyms_lookup_name() 2023-07-27 08:50:39 +02:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: kmsan: unpoison area->list in kcov_remote_area_put() 2022-10-03 14:03:23 -07:00
kexec.c panic, kexec: make __crash_kexec() NMI safe 2022-09-11 21:55:06 -07:00
kexec_core.c kexec: fix a memory leak in crash_shrink_memory() 2023-07-19 16:21:08 +02:00
kexec_elf.c
kexec_file.c kexec: support purgatories with .text.hot sections 2023-06-21 16:00:55 +02:00
kexec_internal.h panic, kexec: make __crash_kexec() NMI safe 2022-09-11 21:55:06 -07:00
kheaders.c kheaders: Use array declaration instead of char 2023-05-11 23:03:02 +09:00
kmod.c
kprobes.c kprobes: Fix to handle forcibly unoptimized kprobes on freeing_list 2023-03-10 09:34:27 +01:00
ksysfs.c kexec: turn all kexec_mutex acquisitions into trylocks 2022-09-11 21:55:06 -07:00
kthread.c signal: break out of wait loops on kthread_stop() 2022-10-09 16:01:59 -07:00
latencytop.c latencytop: use the last element of latency_record of system 2022-09-11 21:55:12 -07:00
Makefile cfi: Fix CFI failure with KASAN 2022-12-31 13:33:08 +01:00
module_signature.c
notifier.c
nsproxy.c Revert "fs/exec: allow to unshare a time namespace on vfork+exec" 2022-09-13 10:38:43 -07:00
padata.c padata: Fix list iterator in padata_do_serial() 2022-12-31 13:32:34 +01:00
panic.c panic: fix the panic_print NMI backtrace setting 2023-03-10 09:34:25 +01:00
params.c
pid.c gfs2: Add glockfd debugfs file 2022-06-29 13:07:16 +02:00
pid_namespace.c rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes() 2023-03-10 09:32:52 +01:00
profile.c kernel/profile.c: simplify duplicated code in profile_setup() 2022-09-11 21:55:12 -07:00
ptrace.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
range.c
reboot.c kernel/reboot: Add SYS_OFF_MODE_RESTART_PREPARE mode 2022-10-04 15:59:36 +02:00
regset.c
relay.c relayfs: fix out-of-bounds access in relay_file_read 2023-05-11 23:03:03 +09:00
resource.c dax/kmem: Fix leak of memory-hotplug resources 2023-03-10 09:34:25 +01:00
resource_kunit.c
rseq.c rseq: Use pr_warn_once() when deprecated/unknown ABI flags are encountered 2022-11-14 09:58:32 +01:00
scftorture.c
scs.c
seccomp.c
signal.c Scheduler changes for v6.1: 2022-10-10 09:10:28 -07:00
smp.c bitmap patches for v6.1-rc1 2022-10-10 12:49:34 -07:00
smpboot.c smpboot: use atomic_try_cmpxchg in cpu_wait_death and cpu_report_death 2022-09-11 21:55:10 -07:00
smpboot.h
softirq.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
stackleak.c
stacktrace.c
static_call.c
static_call_inline.c
stop_machine.c
sys.c kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() 2023-04-26 14:28:39 +02:00
sys_ni.c kernel/sys_ni: add compat entry for fadvise64_64 2022-08-20 15:17:45 -07:00
sysctl-test.c kernel/sysctl-test: use SYSCTL_{ZERO/ONE_HUNDRED} instead of i_{zero/one_hundred} 2022-09-08 16:56:45 -07:00
sysctl.c proc: proc_skip_spaces() shouldn't think it is working on C strings 2022-12-05 12:09:06 -08:00
task_work.c task_work: use try_cmpxchg in task_work_add, task_work_cancel_match and task_work_run 2022-09-11 21:55:10 -07:00
taskstats.c genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
torture.c torture: Fix hang during kthread shutdown phase 2023-03-10 09:34:07 +01:00
tracepoint.c tracepoint: Optimize the critical region of mutex_lock in tracepoint_module_coming() 2022-09-26 13:01:18 -04:00
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c freezer,umh: Fix call_usermode_helper_exec() vs SIGKILL 2023-02-22 12:59:50 +01:00
up.c
user-return-notifier.c
user.c
user_namespace.c ucounts: Split rlimit and ucount values and max values 2022-10-09 16:24:05 -07:00
usermode_driver.c
utsname.c
utsname_sysctl.c kernel/utsname_sysctl.c: Fix hostname polling 2022-10-23 12:01:01 -07:00
watch_queue.c watch_queue: prevent dangling pipe pointer 2023-07-19 16:22:10 +02:00
watchdog.c powerpc updates for 6.0 2022-08-06 16:38:17 -07:00
watchdog_hld.c watchdog/perf: more properly prevent false positives with turbo modes 2023-07-19 16:21:08 +02:00
workqueue.c workqueue: clean up WORK_* constant types, clarify masking 2023-07-23 13:49:19 +02:00
workqueue_internal.h