this hook will allow to account tick in every third of more ticks
to save cpu time for accounting
Bug: 279549765
Change-Id: I5d18e0167fdce076d13aecc653dcf6387bcb25f2
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Add the vendor hook to freezer.c so that OEM's logic can be executed
when the process is about to be frozen. We need to clear the flag for
some tasks and rebind task dependencies for optimization purposes.
Bug: 187458531
Bug: 281920779
Signed-off-by: heshuai1 <heshuai1@xiaomi.com>
Change-Id: Iea42fd9604d6b33ccd6502425416f0dd28eecebb
(cherry picked from commit a1580311c36ca28344b2f03b3c8a72d9f8db5bde)
Exporting the symbol freezer_cgrp_subsys, in that vendor module can
add can_attach & cancel_attach member function. It is vendor-specific
tuning.
Bug: 182496370
Bug: 281920779
Signed-off-by: Zhuguangqing <zhuguangqing@xiaomi.com>
Change-Id: I153682b9d1015eed3f048b45ea6495ebb8f3c261
(cherry picked from commit ee3f4d2821f5b2a794f0a1f5ed423f561a01adae)
(cherry picked from commit 8a90e4d4e555dd5484213c6fec5061958016a194)
This hook will allow us to get signal messages so that we can set
limitations for certain tasks and restore them when receiving important signals.
Bug: 184898838
Bug: 281920779
Signed-off-by: Zhuguangqing <zhuguangqing@xiaomi.com>
Change-Id: I83a28b0a6eb413976f4c57f2314d008ad792fa0d
(cherry picked from commit 58e3f869fc3fe84fb7062496ccd049db47f3ed7f)
and sched_waking to let module probe them
Get task info about sleep and waking
Bug: 190422437
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I828c93f531f84e6133c2c3a7f8faada51683afcf
(cherry picked from commit 13af062abf9b3586623adf7c3cdc5ced7bb5caed)
(cherry picked from commit 869954e72dac700580d0ea5734d07b574e41afe9)
(cherry picked from commit b7f527071c7716c1cc8dfea79353a0f00cec89d8)
Get task info about scheduling delay, iowait, and block time.
It is used to get thread scheduling info when thread happened abnormal situation.
Bug: 189415303
Change-Id: Ib6b548f8a78de5b26d555e9a89e3cc79ea2d1024
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
(cherry picked from commit a6bb1af39d11ef0360cb34bb31b7224ca4db031f)
(cherry picked from commit 6d8d2ab52facfd6d5de2715e2470872e6a70cf22)
(cherry picked from commit c3c29177681354b815763c3592e227b2288f341d)
Export symbol of the function wq_worker_comm() in kernel/workqueue.c for dlkm to get the description of the kworker process. It is used to get the description when kworker thread happened abnormal situation.
Bug: 208394207
Signed-off-by: zhengding chen <chenzhengding@oppo.com>
Change-Id: I2e7ddd52a15e22e99e6596f16be08243af1bb473
(cherry picked from commit 28de74186185e339123c86984729818d0d2d7f43)
(cherry picked from commit 87e0e98c25ba8e121975708943335e3abad651d9)
(cherry picked from commit 38a713dc80959bd855580a51b75654fb6fe9a5dd)
Apparently despite it being marked inline, the compiler
may not inline __down_read_common() which makes it difficult
to identify the cause of lock contention, as the blocked
function in traceevents will always be listed as
__down_read_common().
So this patch adds __always_inline annotation to the common
function (as well as the inlined helper callers) to force it to
be inlined so the blocking function will be listed (via Wchan)
in traceevents.
Fixes: c995e638cc ("locking/rwsem: Fold __down_{read,write}*()")
Reported-by: Tim Murray <timmurray@google.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20230503023351.2832796-1-jstultz@google.com
Bug: 277817995
(cherry picked from commit 92cc5d00a431e96e5a49c0b97e5ad4fa7536bd4b
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/urgent)
Signed-off-by: John Stultz <jstultz@google.com>
(cherry picked from https://android-review.googlesource.com/q/commit:a6c75b2e64573cb9f49f6b89808207856fc0309b)
Merged-In: Ifad7ed7fe9f2d5a9eb0cfe7c35e45c0e86bc3ad4
Change-Id: Ifad7ed7fe9f2d5a9eb0cfe7c35e45c0e86bc3ad4
This hook allows us to capture information when a process is forked so
that we can stat and set some key task's CPU affinity in the ko module
later. This patch, along with aosp/2548175, is necessary for our
affinity settings.
Bug: 183674818
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: Ib93e05e5f6c338c5f7ada56bfebdd705f87f1f66
(cherry picked from commit a188361628461c58a4dfc72869d9acb1dfa2542f)
Exporting the symbol cpuset_cpus_allowed() so that we can adjust a
certain type of application's CPU affinity in vendor hooks according
to our tuning policy.
Related commit: aosp/2548175
Bug: 189725786
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: I7919a893ab64bb441ab43cbb0b16825ed76d802d
(cherry picked from commit 5a7d01ed73e4fc812fda1d7288086dc73a283405)
Add vendor hooks for CPU affinity to support OEM's tuning policy, where
we can block or unblock a certain type of application's CPU affinity.
Bug: 183674818
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: I3402abec4d9faa08f564409bfb8db8d7902f3aa2
(cherry picked from commit 7cf9646c245fdc63e2a3c9fad457c11fabdd2dfc)
We want to use this hook to record the sleeping time due to Futex
Bug: 210947226
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I637f889dce42937116d10979e0c40fddf96cd1a2
(cherry picked from commit a7ab784f601a93a78c1c22cd0aacc2af64d8e3c8)
If an important task is going to sleep through do_futex(),
find out it's futex-owner by the pid comes from userspace,
and boost the owner by some means to shorten the sleep time.
How to boost? Depends on these hooks:
commit 53e809978443 ("ANDROID: vendor_hooks: Add hooks for scheduler")
Bug: 243110112
Signed-off-by: xieliujie <xieliujie@oppo.com>
Change-Id: I9a315cfb414fd34e0ef7a2cf9d57df50d4dd984f
(cherry picked from commit 548da5d23d98b796cf9a478675622a606b3307c8)
Add vendor hooks in add/update/remove frequency QoS request process to
ensure that we can access the OEM's "frequency watchdog" logic for
abnormal frequency monitoring. This is necessary for our power tuning
policy.
Bug: 187458531
Signed-off-by: heshuai1 <heshuai1@xiaomi.com>
Change-Id: I1fb8fd6134432ecfb44ad242c66ccd8280ab9b43
(cherry picked from commit c445fe4dc67ad74dacfa548bc78876a7ce057086)
Exporting the symbols find_user() & free_uid() to access user task
information in ko module for monitoring and optimization purposes. This
is a necessary component of our scheduling policy.
Bug: 183674818
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: I12135c0af312904dd21b6f074beda086ad5ece98
(cherry picked from commit 16350016d8a3678da5012343ca00fa9918340c83)
(cherry picked from commit eec2cd3df3aa2d92136658d3619dc5142155c7d4)
In order to implement our scheduling tuning policy in certain cases, we
need to initialize the variables that we have defined in the
user_struct. To achieve this, we will add a vendor hook to user.c at
alloc_uid, which will ensure that our own logic is executed during the
initialization of the user_struct.
Bug: 187458531
Signed-off-by: heshuai1 <heshuai1@xiaomi.com>
Change-Id: I078484aac2c3d396aba5971d6d0f491652f3781c
(cherry picked from commit c9b8fa644f45e9c99da85d8947f6c7e06771f205)
(cherry picked from commit 9ac0923ef565e4de4e1f35edcba6fcb7e45948c9)
Add "cpufreq_policy" and "need_freq_update" parameters to the vendor
hook to enable frequency calculation in certain special cases related to
OEM's frequency tuning policy.
Bug: 183674818
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: I232d2e1ae885d6736eca9e4709870f4272b4873d
clocksource driver may use sched_clock_register()
to resigter itself as a sched_clock source.
Export it to support building such driver
as module, like timer-mediatek.c
Link: https://lore.kernel.org/lkml/20230421034649.15247-5-walter.chang@mediatek.com/T/
Bug: 161675989
Change-Id: Ib052d1fd7ccf6a7422eb6f1755515e1236285e01
Signed-off-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Add hooks to apply oem's optimization of rwsem and mutex
Bug: 182237112
Signed-off-by: xieliujie <xieliujie@oppo.com>
(cherry picked from commit 80b4341d0520dce2ef1e33aef06c6a2bc8ada518)
Signed-off-by: xieliujie <xieliujie@oppo.com>
Change-Id: I36895c432e5b6d6bff8781b4a7872badb693284c
Signed-off-by: Carlos Llamas <cmllamas@google.com>
[cmllamas: completes the cherry-pick of original commit 80b4341d0520
since commit 0902cc73b793 was only partial]
(cherry picked from commit d4528a28cb5be0c322031f333a6230fa3042931f)
These hooks help us do the following things:
a) Record the number of mutex and rwsem optimistic spin.
b) Monitor the time of mutex and rwsem optimistic spin.
c) Make it possible if oems don't want mutex and rwsem to optimistic spin
for a long time.
Bug: 267565260
Change-Id: I2bee30fb17946be85e026213b481aeaeaee2459f
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
(cherry picked from commit d01f7e1269d2651768a0bd91a88f76339a1cb427)
(cherry picked from commit 05b5ff11ad98c5896b352b4c376a84b63684e06c)
Providing vendor hooks to record the start time of holding the lock, which
protects rwsem/mutex locking-process from being preemptedfor a short time
in some cases.
- android_vh_record_mutex_lock_starttime
- android_vh_record_rtmutex_lock_starttime
- android_vh_record_rwsem_lock_starttime
- android_vh_record_pcpu_rwsem_starttime
Bug: 241191475
Signed-off-by: Peifeng Li <lipeifeng@oppo.com>
Change-Id: I0e967a1e8b77c32a1ad588acd54028fae2f90c4e
(cherry picked from commit f7294947672eb6b786f3c16b49e71e6a239101ad)
Current 500ms min window size for psi triggers limits polling interval
to 50ms to prevent polling threads from using too much cpu bandwidth by
polling too frequently. However the number of cgroups with triggers is
unlimited, so this protection can be defeated by creating multiple
cgroups with psi triggers (triggers in each cgroup are served by a single
"psimon" kernel thread).
Instead of limiting min polling period, which also limits the latency of
psi events, it's better to limit psi trigger creation to authorized users
only, like we do for system-wide psi triggers (/proc/pressure/* files can
be written only by processes with CAP_SYS_RESOURCE capability). This also
makes access rules for cgroup psi files consistent with system-wide ones.
Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
remove the psi window min size limitation.
Bug: 269247660
Change-Id: I8876aa306cf2ba5acdf4daa5a5eff0665537bfeb
Suggested-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Link: https://lore.kernel.org/all/20230303011346.3342233-1-surenb@google.com/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
IOMMU_SYS_CACHE_NWA allows buffers for non-coherent devices to be
mapped with the correct memory attributes so that the buffers can be
cached in the system cache, with a no write allocate cache policy.
However, this property is only usable by drivers that invoke the IOMMU
API directly; it is not usable by drivers that use the DMA API.
Thus, introduce DMA_ATTR_SYS_CACHE_NWA, so that drivers for
non-coherent devices that use the DMA API can use it to specify if
they want a buffer to be cached in the system cache.
Bug: 189339242
Change-Id: Ic812a1fb144a58deb4279c2bf121fc6cc4c3b208
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
IOMMU_SYS_CACHE allows buffers for non-coherent devices to be mapped
with the correct memory attributes so that the buffers can be cached
in the system cache. However, this property is only usable by drivers
that invoke the IOMMU API directly; it is not usable by drivers that
use the DMA API.
Thus, introduce DMA_ATTR_SYS_CACHE, so that drivers for non-coherent
devices that use the DMA API can use it to specify if they want a
buffer to be cached in the system cache.
Bug: 189339242
Change-Id: I849d7a3f36b689afd2f6ee400507223fd6395158
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Currently, the frequency is calculated by max freq * 1.25 * util / max cap.
Add a vendor hook to adjust the frequency when the calculation
overestimate.
android_vh_map_util_freq
adjust util to freq calculation
Bug: 177845439
Signed-off-by: Yun Hsiang <yun.hsiang@mediatek.com>
Change-Id: I9aa9079f00af7d3380b19f2fe21b75cddd107d15
(cherry picked from commit 3122e3ec9672036384304fdeaa1b1815f60ba817)
(cherry picked from commit a2d89d4f3a)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmRBFW0ACgkQONu9yGCS
aT7Jew//Ytw9+JQ71LT1TuJnQ1GayJOL1BW5hgxoYgnBFasWDwsGA9rzHs6KHqHb
0Vjk7MX7VZB+6zWakOxY5CFVM33J4fS7wY8WE2bj8X3QQhD/J0HQDMdELvSBi3qF
7xI6sghEQEwOuwAj2+CBqm/q7rA5FTnO1QgJuk/AKJ6PHGRiQeZ7q1zGpFvSaj7S
cyKvY99RsjnUN+PYk4LE2+u/6DVCqiWYVDQrdjalb9zsrXg4+nmPH6ZJzZX8+bbM
eM0xAR675V8TXqDi+8bj7tWmiS52XyjYF3Q/bu9BmU67DqslH9FFyVQxhgTHUZpN
qWXkojEU2djIc3qt7T/bpZS/vD8Kg3Px1CgyIRN8Y5SlZfhZyqVdTZ4AQCtJuLQJ
wDIdQCLlGzzDNFvbD+LdfJSjZt7Ig1sM/HwtPZhUA9yF0FN1XV3dcESzCOeI0/S7
ohRh8cs1sidnxrbvVwiVNENSqbJD7G9/9vVjIfyfcnt57q+fs6xCBhpOyNoVOp74
I5i6ALMcVZoAB50vDjnoGZsSRe9W2AmOV6UMIkVCvRCWYFqBpgVftMTAACNyljni
UlXmO7aDQj+nbHD/auclFtU02oHQbk62FSrwoWMFS090zWztQqUhgRY7Qnl13yCM
poEvrKlskXhvunsNtdVmI5O3N2GANWKgGwkyFIiXvgxKkw1qpUo=
=zeN9
-----END PGP SIGNATURE-----
Merge 6.1.25 into android14-6.1
Changes in 6.1.25
Revert "pinctrl: amd: Disable and mask interrupts on resume"
drm/amd/display: Pass the right info to drm_dp_remove_payload
ALSA: emu10k1: fix capture interrupt handler unlinking
ALSA: hda/sigmatel: add pin overrides for Intel DP45SG motherboard
ALSA: i2c/cs8427: fix iec958 mixer control deactivation
ALSA: hda: patch_realtek: add quirk for Asus N7601ZM
ALSA: hda/realtek: Add quirks for Lenovo Z13/Z16 Gen2
ALSA: firewire-tascam: add missing unwind goto in snd_tscm_stream_start_duplex()
ALSA: emu10k1: don't create old pass-through playback device on Audigy
ALSA: hda/sigmatel: fix S/PDIF out on Intel D*45* motherboards
ALSA: hda/hdmi: disable KAE for Intel DG2
Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp}
Bluetooth: Fix race condition in hidp_session_thread
bluetooth: btbcm: Fix logic error in forming the board name.
Bluetooth: Free potentially unfreed SCO connection
Bluetooth: hci_conn: Fix possible UAF
btrfs: restore the thread_pool= behavior in remount for the end I/O workqueues
btrfs: fix fast csum implementation detection
fbmem: Reject FB_ACTIVATE_KD_TEXT from userspace
mtdblock: tolerate corrected bit-flips
mtd: rawnand: meson: fix bitmask for length in command word
mtd: rawnand: stm32_fmc2: remove unsupported EDO mode
mtd: rawnand: stm32_fmc2: use timings.mode instead of checking tRC_min
KVM: arm64: PMU: Restore the guest's EL0 event counting after migration
fbcon: Fix error paths in set_con2fb_map
fbcon: set_con2fb_map needs to set con2fb_map!
drm/i915/dsi: fix DSS CTL register offsets for TGL+
clk: sprd: set max_register according to mapping range
RDMA/irdma: Do not generate SW completions for NOPs
RDMA/irdma: Fix memory leak of PBLE objects
RDMA/irdma: Increase iWARP CM default rexmit count
RDMA/irdma: Add ipv4 check to irdma_find_listener()
IB/mlx5: Add support for 400G_8X lane speed
RDMA/erdma: Update default EQ depth to 4096 and max_send_wr to 8192
RDMA/erdma: Inline mtt entries into WQE if supported
RDMA/erdma: Defer probing if netdevice can not be found
clk: rs9: Fix suspend/resume
RDMA/cma: Allow UD qp_type to join multicast only
bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp
LoongArch, bpf: Fix jit to skip speculation barrier opcode
dmaengine: apple-admac: Handle 'global' interrupt flags
dmaengine: apple-admac: Set src_addr_widths capability
dmaengine: apple-admac: Fix 'current_tx' not getting freed
9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition
bpf, arm64: Fixed a BTI error on returning to patched function
KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
KVM: arm64: Advertise ID_AA64PFR0_EL1.CSV2/3 to protected VMs
niu: Fix missing unwind goto in niu_alloc_channels()
tcp: restrict net.ipv4.tcp_app_win
bonding: fix ns validation on backup slaves
iavf: refactor VLAN filter states
iavf: remove active_cvlans and active_svlans bitmaps
net: openvswitch: fix race on port output
Bluetooth: hci_conn: Fix not cleaning up on LE Connection failure
Bluetooth: Fix printing errors if LE Connection times out
Bluetooth: SCO: Fix possible circular locking dependency sco_sock_getsockopt
Bluetooth: Set ISO Data Path on broadcast sink
drm/armada: Fix a potential double free in an error handling path
qlcnic: check pci_reset_function result
net: wwan: iosm: Fix error handling path in ipc_pcie_probe()
cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex
net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume()
sctp: fix a potential overflow in sctp_ifwdtsn_skip
RDMA/core: Fix GID entry ref leak when create_ah fails
selftests: openvswitch: adjust datapath NL message declaration
udp6: fix potential access to stale information
net: macb: fix a memory corruption in extended buffer descriptor mode
skbuff: Fix a race between coalescing and releasing SKBs
libbpf: Fix single-line struct definition output in btf_dump
ARM: 9290/1: uaccess: Fix KASAN false-positives
ARM: dts: qcom: apq8026-lg-lenok: add missing reserved memory
power: supply: rk817: Fix unsigned comparison with less than zero
power: supply: cros_usbpd: reclassify "default case!" as debug
power: supply: axp288_fuel_gauge: Added check for negative values
selftests/bpf: Fix progs/find_vma_fail1.c build error.
wifi: mwifiex: mark OF related data as maybe unused
i2c: imx-lpi2c: clean rx/tx buffers upon new message
i2c: hisi: Avoid redundant interrupts
efi: sysfb_efi: Add quirk for Lenovo Yoga Book X91F/L
block: ublk_drv: mark device as LIVE before adding disk
ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG
drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F
hwmon: (peci/cputemp) Fix miscalculated DTS for SKX
hwmon: (xgene) Fix ioremap and memremap leak
verify_pefile: relax wrapper length check
asymmetric_keys: log on fatal failures in PE/pkcs7
nvme: send Identify with CNS 06h only to I/O controllers
wifi: iwlwifi: mvm: fix mvmtxq->stopped handling
wifi: iwlwifi: mvm: protect TXQ list manipulation
drm/amdgpu: add mes resume when do gfx post soft reset
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
drm/amdgpu/gfx: set cg flags to enter/exit safe mode
ACPI: resource: Add Medion S17413 to IRQ override quirk
x86/hyperv: Move VMCB enlightenment definitions to hyperv-tlfs.h
KVM: selftests: Move "struct hv_enlightenments" to x86_64/svm.h
KVM: SVM: Add a proper field for Hyper-V VMCB enlightenments
x86/hyperv: KVM: Rename "hv_enlightenments" to "hv_vmcb_enlightenments"
KVM: SVM: Flush Hyper-V TLB when required
tracing: Add trace_array_puts() to write into instance
tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance
maple_tree: fix write memory barrier of nodes once dead for RCU mode
ksmbd: avoid out of bounds access in decode_preauth_ctxt()
riscv: add icache flush for nommu sigreturn trampoline
HID: intel-ish-hid: Fix kernel panic during warm reset
net: sfp: initialize sfp->i2c_block_size at sfp allocation
net: phy: nxp-c45-tja11xx: add remove callback
net: phy: nxp-c45-tja11xx: fix unsigned long multiplication overflow
scsi: ses: Handle enclosure with just a primary component gracefully
x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot
cgroup: fix display of forceidle time at root
cgroup/cpuset: Fix partition root's cpuset.cpus update bug
cgroup/cpuset: Wake up cpuset_attach_wq tasks in cpuset_cancel_attach()
drm/amd/pm: correct SMU13.0.7 pstate profiling clock settings
drm/amd/pm: correct SMU13.0.7 max shader clock reporting
mptcp: use mptcp_schedule_work instead of open-coding it
mptcp: stricter state check in mptcp_worker
ubi: Fix failure attaching when vid_hdr offset equals to (sub)page size
ubi: Fix deadlock caused by recursively holding work_sem
i2c: mchp-pci1xxxx: Update Timing registers
powerpc/papr_scm: Update the NUMA distance table for the target node
sched/fair: Fix imbalance overflow
x86/rtc: Remove __init for runtime functions
i2c: ocores: generate stop condition after timeout in polling mode
cifs: fix negotiate context parsing
nvme-pci: mark Lexar NM760 as IGNORE_DEV_SUBNQN
nvme-pci: add NVME_QUIRK_BOGUS_NID for T-FORCE Z330 SSD
cgroup/cpuset: Skip spread flags update on v2
cgroup/cpuset: Make cpuset_fork() handle CLONE_INTO_CGROUP properly
cgroup/cpuset: Add cpuset_can_fork() and cpuset_cancel_fork() methods
Linux 6.1.25
Change-Id: Ib4d2c49ea9bacb8d8dbdb7b3a4eecce937016427
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
commit 659c0ce1cb9efc7f58d380ca4bb2a51ae9e30553 upstream.
Linux Security Modules (LSMs) that implement the "capable" hook will
usually emit an access denial message to the audit log whenever they
"block" the current task from using the given capability based on their
security policy.
The occurrence of a denial is used as an indication that the given task
has attempted an operation that requires the given access permission, so
the callers of functions that perform LSM permission checks must take care
to avoid calling them too early (before it is decided if the permission is
actually needed to perform the requested operation).
The __sys_setres[ug]id() functions violate this convention by first
calling ns_capable_setid() and only then checking if the operation
requires the capability or not. It means that any caller that has the
capability granted by DAC (task's capability set) but not by MAC (LSMs)
will generate a "denied" audit record, even if is doing an operation for
which the capability is not required.
Fix this by reordering the checks such that ns_capable_setid() is checked
last and -EPERM is returned immediately if it returns false.
While there, also do two small optimizations:
* move the capability check before prepare_creds() and
* bail out early in case of a no-op.
Link: https://lkml.kernel.org/r/20230217162154.837549-1-omosnace@redhat.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71b547f561247897a0a14f3082730156c0533fed ]
Juan Jose et al reported an issue found via fuzzing where the verifier's
pruning logic prematurely marks a program path as safe.
Consider the following program:
0: (b7) r6 = 1024
1: (b7) r7 = 0
2: (b7) r8 = 0
3: (b7) r9 = -2147483648
4: (97) r6 %= 1025
5: (05) goto pc+0
6: (bd) if r6 <= r9 goto pc+2
7: (97) r6 %= 1
8: (b7) r9 = 0
9: (bd) if r6 <= r9 goto pc+1
10: (b7) r6 = 0
11: (b7) r0 = 0
12: (63) *(u32 *)(r10 -4) = r0
13: (18) r4 = 0xffff888103693400 // map_ptr(ks=4,vs=48)
15: (bf) r1 = r4
16: (bf) r2 = r10
17: (07) r2 += -4
18: (85) call bpf_map_lookup_elem#1
19: (55) if r0 != 0x0 goto pc+1
20: (95) exit
21: (77) r6 >>= 10
22: (27) r6 *= 8192
23: (bf) r1 = r0
24: (0f) r0 += r6
25: (79) r3 = *(u64 *)(r0 +0)
26: (7b) *(u64 *)(r1 +0) = r3
27: (95) exit
The verifier treats this as safe, leading to oob read/write access due
to an incorrect verifier conclusion:
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r6 = 1024 ; R6_w=1024
1: (b7) r7 = 0 ; R7_w=0
2: (b7) r8 = 0 ; R8_w=0
3: (b7) r9 = -2147483648 ; R9_w=-2147483648
4: (97) r6 %= 1025 ; R6_w=scalar()
5: (05) goto pc+0
6: (bd) if r6 <= r9 goto pc+2 ; R6_w=scalar(umin=18446744071562067969,var_off=(0xffffffff00000000; 0xffffffff)) R9_w=-2147483648
7: (97) r6 %= 1 ; R6_w=scalar()
8: (b7) r9 = 0 ; R9=0
9: (bd) if r6 <= r9 goto pc+1 ; R6=scalar(umin=1) R9=0
10: (b7) r6 = 0 ; R6_w=0
11: (b7) r0 = 0 ; R0_w=0
12: (63) *(u32 *)(r10 -4) = r0
last_idx 12 first_idx 9
regs=1 stack=0 before 11: (b7) r0 = 0
13: R0_w=0 R10=fp0 fp-8=0000????
13: (18) r4 = 0xffff8ad3886c2a00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
17: (07) r2 += -4 ; R2_w=fp-4
18: (85) call bpf_map_lookup_elem#1 ; R0=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0)
19: (55) if r0 != 0x0 goto pc+1 ; R0=0
20: (95) exit
from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
21: (77) r6 >>= 10 ; R6_w=0
22: (27) r6 *= 8192 ; R6_w=0
23: (bf) r1 = r0 ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0)
24: (0f) r0 += r6
last_idx 24 first_idx 19
regs=40 stack=0 before 23: (bf) r1 = r0
regs=40 stack=0 before 22: (27) r6 *= 8192
regs=40 stack=0 before 21: (77) r6 >>= 10
regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1
parent didn't have regs=40 stack=0 marks: R0_rw=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) R6_rw=P0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
last_idx 18 first_idx 9
regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
regs=40 stack=0 before 17: (07) r2 += -4
regs=40 stack=0 before 16: (bf) r2 = r10
regs=40 stack=0 before 15: (bf) r1 = r4
regs=40 stack=0 before 13: (18) r4 = 0xffff8ad3886c2a00
regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
regs=40 stack=0 before 11: (b7) r0 = 0
regs=40 stack=0 before 10: (b7) r6 = 0
25: (79) r3 = *(u64 *)(r0 +0) ; R0_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
26: (7b) *(u64 *)(r1 +0) = r3 ; R1_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
27: (95) exit
from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0
11: (b7) r0 = 0 ; R0_w=0
12: (63) *(u32 *)(r10 -4) = r0
last_idx 12 first_idx 11
regs=1 stack=0 before 11: (b7) r0 = 0
13: R0_w=0 R10=fp0 fp-8=0000????
13: (18) r4 = 0xffff8ad3886c2a00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
17: (07) r2 += -4 ; R2_w=fp-4
18: (85) call bpf_map_lookup_elem#1
frame 0: propagating r6
last_idx 19 first_idx 11
regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
regs=40 stack=0 before 17: (07) r2 += -4
regs=40 stack=0 before 16: (bf) r2 = r10
regs=40 stack=0 before 15: (bf) r1 = r4
regs=40 stack=0 before 13: (18) r4 = 0xffff8ad3886c2a00
regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
regs=40 stack=0 before 11: (b7) r0 = 0
parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_r=P0 R7=0 R8=0 R9=0 R10=fp0
last_idx 9 first_idx 9
regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1
parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar() R7_w=0 R8_w=0 R9_rw=0 R10=fp0
last_idx 8 first_idx 0
regs=40 stack=0 before 8: (b7) r9 = 0
regs=40 stack=0 before 7: (97) r6 %= 1
regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
regs=40 stack=0 before 5: (05) goto pc+0
regs=40 stack=0 before 4: (97) r6 %= 1025
regs=40 stack=0 before 3: (b7) r9 = -2147483648
regs=40 stack=0 before 2: (b7) r8 = 0
regs=40 stack=0 before 1: (b7) r7 = 0
regs=40 stack=0 before 0: (b7) r6 = 1024
19: safe
frame 0: propagating r6
last_idx 9 first_idx 0
regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
regs=40 stack=0 before 5: (05) goto pc+0
regs=40 stack=0 before 4: (97) r6 %= 1025
regs=40 stack=0 before 3: (b7) r9 = -2147483648
regs=40 stack=0 before 2: (b7) r8 = 0
regs=40 stack=0 before 1: (b7) r7 = 0
regs=40 stack=0 before 0: (b7) r6 = 1024
from 6 to 9: safe
verification time 110 usec
stack depth 4
processed 36 insns (limit 1000000) max_states_per_insn 0 total_states 3 peak_states 3 mark_read 2
The verifier considers this program as safe by mistakenly pruning unsafe
code paths. In the above func#0, code lines 0-10 are of interest. In line
0-3 registers r6 to r9 are initialized with known scalar values. In line 4
the register r6 is reset to an unknown scalar given the verifier does not
track modulo operations. Due to this, the verifier can also not determine
precisely which branches in line 6 and 9 are taken, therefore it needs to
explore them both.
As can be seen, the verifier starts with exploring the false/fall-through
paths first. The 'from 19 to 21' path has both r6=0 and r9=0 and the pointer
arithmetic on r0 += r6 is therefore considered safe. Given the arithmetic,
r6 is correctly marked for precision tracking where backtracking kicks in
where it walks back the current path all the way where r6 was set to 0 in
the fall-through branch.
Next, the pruning logics pops the path 'from 9 to 11' from the stack. Also
here, the state of the registers is the same, that is, r6=0 and r9=0, so
that at line 19 the path can be pruned as it is considered safe. It is
interesting to note that the conditional in line 9 turned r6 into a more
precise state, that is, in the fall-through path at the beginning of line
10, it is R6=scalar(umin=1), and in the branch-taken path (which is analyzed
here) at the beginning of line 11, r6 turned into a known const r6=0 as
r9=0 prior to that and therefore (unsigned) r6 <= 0 concludes that r6 must
be 0 (**):
[...] ; R6_w=scalar()
9: (bd) if r6 <= r9 goto pc+1 ; R6=scalar(umin=1) R9=0
[...]
from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0
[...]
The next path is 'from 6 to 9'. The verifier considers the old and current
state equivalent, and therefore prunes the search incorrectly. Looking into
the two states which are being compared by the pruning logic at line 9, the
old state consists of R6_rwD=Pscalar() R9_rwD=0 R10=fp0 and the new state
consists of R1=ctx(off=0,imm=0) R6_w=scalar(umax=18446744071562067968)
R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0. While r6 had the reg->precise flag
correctly set in the old state, r9 did not. Both r6'es are considered as
equivalent given the old one is a superset of the current, more precise one,
however, r9's actual values (0 vs 0x80000000) mismatch. Given the old r9
did not have reg->precise flag set, the verifier does not consider the
register as contributing to the precision state of r6, and therefore it
considered both r9 states as equivalent. However, for this specific pruned
path (which is also the actual path taken at runtime), register r6 will be
0x400 and r9 0x80000000 when reaching line 21, thus oob-accessing the map.
The purpose of precision tracking is to initially mark registers (including
spilled ones) as imprecise to help verifier's pruning logic finding equivalent
states it can then prune if they don't contribute to the program's safety
aspects. For example, if registers are used for pointer arithmetic or to pass
constant length to a helper, then the verifier sets reg->precise flag and
backtracks the BPF program instruction sequence and chain of verifier states
to ensure that the given register or stack slot including their dependencies
are marked as precisely tracked scalar. This also includes any other registers
and slots that contribute to a tracked state of given registers/stack slot.
This backtracking relies on recorded jmp_history and is able to traverse
entire chain of parent states. This process ends only when all the necessary
registers/slots and their transitive dependencies are marked as precise.
The backtrack_insn() is called from the current instruction up to the first
instruction, and its purpose is to compute a bitmask of registers and stack
slots that need precision tracking in the parent's verifier state. For example,
if a current instruction is r6 = r7, then r6 needs precision after this
instruction and r7 needs precision before this instruction, that is, in the
parent state. Hence for the latter r7 is marked and r6 unmarked.
For the class of jmp/jmp32 instructions, backtrack_insn() today only looks
at call and exit instructions and for all other conditionals the masks
remain as-is. However, in the given situation register r6 has a dependency
on r9 (as described above in **), so also that one needs to be marked for
precision tracking. In other words, if an imprecise register influences a
precise one, then the imprecise register should also be marked precise.
Meaning, in the parent state both dest and src register need to be tracked
for precision and therefore the marking must be more conservative by setting
reg->precise flag for both. The precision propagation needs to cover both
for the conditional: if the src reg was marked but not the dst reg and vice
versa.
After the fix the program is correctly rejected:
func#0 @0
0: R1=ctx(off=0,imm=0) R10=fp0
0: (b7) r6 = 1024 ; R6_w=1024
1: (b7) r7 = 0 ; R7_w=0
2: (b7) r8 = 0 ; R8_w=0
3: (b7) r9 = -2147483648 ; R9_w=-2147483648
4: (97) r6 %= 1025 ; R6_w=scalar()
5: (05) goto pc+0
6: (bd) if r6 <= r9 goto pc+2 ; R6_w=scalar(umin=18446744071562067969,var_off=(0xffffffff80000000; 0x7fffffff),u32_min=-2147483648) R9_w=-2147483648
7: (97) r6 %= 1 ; R6_w=scalar()
8: (b7) r9 = 0 ; R9=0
9: (bd) if r6 <= r9 goto pc+1 ; R6=scalar(umin=1) R9=0
10: (b7) r6 = 0 ; R6_w=0
11: (b7) r0 = 0 ; R0_w=0
12: (63) *(u32 *)(r10 -4) = r0
last_idx 12 first_idx 9
regs=1 stack=0 before 11: (b7) r0 = 0
13: R0_w=0 R10=fp0 fp-8=0000????
13: (18) r4 = 0xffff9290dc5bfe00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
17: (07) r2 += -4 ; R2_w=fp-4
18: (85) call bpf_map_lookup_elem#1 ; R0=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0)
19: (55) if r0 != 0x0 goto pc+1 ; R0=0
20: (95) exit
from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
21: (77) r6 >>= 10 ; R6_w=0
22: (27) r6 *= 8192 ; R6_w=0
23: (bf) r1 = r0 ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0)
24: (0f) r0 += r6
last_idx 24 first_idx 19
regs=40 stack=0 before 23: (bf) r1 = r0
regs=40 stack=0 before 22: (27) r6 *= 8192
regs=40 stack=0 before 21: (77) r6 >>= 10
regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1
parent didn't have regs=40 stack=0 marks: R0_rw=map_value_or_null(id=1,off=0,ks=4,vs=48,imm=0) R6_rw=P0 R7=0 R8=0 R9=0 R10=fp0 fp-8=mmmm????
last_idx 18 first_idx 9
regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
regs=40 stack=0 before 17: (07) r2 += -4
regs=40 stack=0 before 16: (bf) r2 = r10
regs=40 stack=0 before 15: (bf) r1 = r4
regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00
regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
regs=40 stack=0 before 11: (b7) r0 = 0
regs=40 stack=0 before 10: (b7) r6 = 0
25: (79) r3 = *(u64 *)(r0 +0) ; R0_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
26: (7b) *(u64 *)(r1 +0) = r3 ; R1_w=map_value(off=0,ks=4,vs=48,imm=0) R3_w=scalar()
27: (95) exit
from 9 to 11: R1=ctx(off=0,imm=0) R6=0 R7=0 R8=0 R9=0 R10=fp0
11: (b7) r0 = 0 ; R0_w=0
12: (63) *(u32 *)(r10 -4) = r0
last_idx 12 first_idx 11
regs=1 stack=0 before 11: (b7) r0 = 0
13: R0_w=0 R10=fp0 fp-8=0000????
13: (18) r4 = 0xffff9290dc5bfe00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
17: (07) r2 += -4 ; R2_w=fp-4
18: (85) call bpf_map_lookup_elem#1
frame 0: propagating r6
last_idx 19 first_idx 11
regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
regs=40 stack=0 before 17: (07) r2 += -4
regs=40 stack=0 before 16: (bf) r2 = r10
regs=40 stack=0 before 15: (bf) r1 = r4
regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00
regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
regs=40 stack=0 before 11: (b7) r0 = 0
parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_r=P0 R7=0 R8=0 R9=0 R10=fp0
last_idx 9 first_idx 9
regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1
parent didn't have regs=240 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar() R7_w=0 R8_w=0 R9_rw=P0 R10=fp0
last_idx 8 first_idx 0
regs=240 stack=0 before 8: (b7) r9 = 0
regs=40 stack=0 before 7: (97) r6 %= 1
regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
regs=240 stack=0 before 5: (05) goto pc+0
regs=240 stack=0 before 4: (97) r6 %= 1025
regs=240 stack=0 before 3: (b7) r9 = -2147483648
regs=40 stack=0 before 2: (b7) r8 = 0
regs=40 stack=0 before 1: (b7) r7 = 0
regs=40 stack=0 before 0: (b7) r6 = 1024
19: safe
from 6 to 9: R1=ctx(off=0,imm=0) R6_w=scalar(umax=18446744071562067968) R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0
9: (bd) if r6 <= r9 goto pc+1
last_idx 9 first_idx 0
regs=40 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
regs=240 stack=0 before 5: (05) goto pc+0
regs=240 stack=0 before 4: (97) r6 %= 1025
regs=240 stack=0 before 3: (b7) r9 = -2147483648
regs=40 stack=0 before 2: (b7) r8 = 0
regs=40 stack=0 before 1: (b7) r7 = 0
regs=40 stack=0 before 0: (b7) r6 = 1024
last_idx 9 first_idx 0
regs=200 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
regs=240 stack=0 before 5: (05) goto pc+0
regs=240 stack=0 before 4: (97) r6 %= 1025
regs=240 stack=0 before 3: (b7) r9 = -2147483648
regs=40 stack=0 before 2: (b7) r8 = 0
regs=40 stack=0 before 1: (b7) r7 = 0
regs=40 stack=0 before 0: (b7) r6 = 1024
11: R6=scalar(umax=18446744071562067968) R9=-2147483648
11: (b7) r0 = 0 ; R0_w=0
12: (63) *(u32 *)(r10 -4) = r0
last_idx 12 first_idx 11
regs=1 stack=0 before 11: (b7) r0 = 0
13: R0_w=0 R10=fp0 fp-8=0000????
13: (18) r4 = 0xffff9290dc5bfe00 ; R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
15: (bf) r1 = r4 ; R1_w=map_ptr(off=0,ks=4,vs=48,imm=0) R4_w=map_ptr(off=0,ks=4,vs=48,imm=0)
16: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
17: (07) r2 += -4 ; R2_w=fp-4
18: (85) call bpf_map_lookup_elem#1 ; R0_w=map_value_or_null(id=3,off=0,ks=4,vs=48,imm=0)
19: (55) if r0 != 0x0 goto pc+1 ; R0_w=0
20: (95) exit
from 19 to 21: R0=map_value(off=0,ks=4,vs=48,imm=0) R6=scalar(umax=18446744071562067968) R7=0 R8=0 R9=-2147483648 R10=fp0 fp-8=mmmm????
21: (77) r6 >>= 10 ; R6_w=scalar(umax=18014398507384832,var_off=(0x0; 0x3fffffffffffff))
22: (27) r6 *= 8192 ; R6_w=scalar(smax=9223372036854767616,umax=18446744073709543424,var_off=(0x0; 0xffffffffffffe000),s32_max=2147475456,u32_max=-8192)
23: (bf) r1 = r0 ; R0=map_value(off=0,ks=4,vs=48,imm=0) R1_w=map_value(off=0,ks=4,vs=48,imm=0)
24: (0f) r0 += r6
last_idx 24 first_idx 21
regs=40 stack=0 before 23: (bf) r1 = r0
regs=40 stack=0 before 22: (27) r6 *= 8192
regs=40 stack=0 before 21: (77) r6 >>= 10
parent didn't have regs=40 stack=0 marks: R0_rw=map_value(off=0,ks=4,vs=48,imm=0) R6_r=Pscalar(umax=18446744071562067968) R7=0 R8=0 R9=-2147483648 R10=fp0 fp-8=mmmm????
last_idx 19 first_idx 11
regs=40 stack=0 before 19: (55) if r0 != 0x0 goto pc+1
regs=40 stack=0 before 18: (85) call bpf_map_lookup_elem#1
regs=40 stack=0 before 17: (07) r2 += -4
regs=40 stack=0 before 16: (bf) r2 = r10
regs=40 stack=0 before 15: (bf) r1 = r4
regs=40 stack=0 before 13: (18) r4 = 0xffff9290dc5bfe00
regs=40 stack=0 before 12: (63) *(u32 *)(r10 -4) = r0
regs=40 stack=0 before 11: (b7) r0 = 0
parent didn't have regs=40 stack=0 marks: R1=ctx(off=0,imm=0) R6_rw=Pscalar(umax=18446744071562067968) R7_w=0 R8_w=0 R9_w=-2147483648 R10=fp0
last_idx 9 first_idx 0
regs=40 stack=0 before 9: (bd) if r6 <= r9 goto pc+1
regs=240 stack=0 before 6: (bd) if r6 <= r9 goto pc+2
regs=240 stack=0 before 5: (05) goto pc+0
regs=240 stack=0 before 4: (97) r6 %= 1025
regs=240 stack=0 before 3: (b7) r9 = -2147483648
regs=40 stack=0 before 2: (b7) r8 = 0
regs=40 stack=0 before 1: (b7) r7 = 0
regs=40 stack=0 before 0: (b7) r6 = 1024
math between map_value pointer and register with unbounded min value is not allowed
verification time 886 usec
stack depth 4
processed 49 insns (limit 1000000) max_states_per_insn 1 total_states 5 peak_states 5 mark_read 2
Fixes: b5dc0163d8 ("bpf: precise scalar_value tracking")
Reported-by: Juan Jose Lopez Jaimez <jjlopezjaimez@google.com>
Reported-by: Meador Inge <meadori@google.com>
Reported-by: Simon Scannell <simonscannell@google.com>
Reported-by: Nenad Stojanovski <thenenadx@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Juan Jose Lopez Jaimez <jjlopezjaimez@google.com>
Reviewed-by: Meador Inge <meadori@google.com>
Reviewed-by: Simon Scannell <simonscannell@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Changes in 6.1.24
dm cache: Add some documentation to dm-cache-background-tracker.h
dm integrity: Remove bi_sector that's only used by commented debug code
dm: change "unsigned" to "unsigned int"
dm: fix improper splitting for abnormal bios
KVM: arm64: PMU: Align chained counter implementation with architecture pseudocode
KVM: arm64: PMU: Distinguish between 64bit counter and 64bit overflow
KVM: arm64: PMU: Sanitise PMCR_EL0.LP on first vcpu run
KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
gpio: GPIO_REGMAP: select REGMAP instead of depending on it
Drivers: vmbus: Check for channel allocation before looking up relids
ASoC: SOF: ipc4: Ensure DSP is in D0I0 during sof_ipc4_set_get_data()
pwm: Make .get_state() callback return an error code
pwm: hibvt: Explicitly set .polarity in .get_state()
pwm: cros-ec: Explicitly set .polarity in .get_state()
pwm: iqs620a: Explicitly set .polarity in .get_state()
pwm: sprd: Explicitly set .polarity in .get_state()
pwm: meson: Explicitly set .polarity in .get_state()
ASoC: codecs: lpass: fix the order or clks turn off during suspend
KVM: s390: pv: fix external interruption loop not always detected
wifi: mac80211: fix the size calculation of ieee80211_ie_len_eht_cap()
wifi: mac80211: fix invalid drv_sta_pre_rcu_remove calls for non-uploaded sta
net: qrtr: Fix a refcount bug in qrtr_recvmsg()
net: phylink: add phylink_expects_phy() method
net: stmmac: check if MAC needs to attach to a PHY
net: stmmac: remove redundant fixup to support fixed-link mode
l2tp: generate correct module alias strings
wifi: brcmfmac: Fix SDIO suspend/resume regression
NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
nfsd: call op_release, even when op_func returns an error
icmp: guard against too small mtu
ALSA: hda/hdmi: Preserve the previous PCM device upon re-enablement
net: don't let netpoll invoke NAPI if in xmit context
net: dsa: mv88e6xxx: Reset mv88e6393x force WD event bit
sctp: check send stream number after wait_for_sndbuf
net: qrtr: Do not do DEL_SERVER broadcast after DEL_CLIENT
ipv6: Fix an uninit variable access bug in __ip6_make_skb()
platform/x86: think-lmi: Fix memory leak when showing current settings
platform/x86: think-lmi: Fix memory leaks when parsing ThinkStation WMI strings
platform/x86: think-lmi: Clean up display of current_value on Thinkstation
gpio: davinci: Do not clear the bank intr enable bit in save_context
gpio: davinci: Add irq chip flag to skip set wake
net: ethernet: ti: am65-cpsw: Fix mdio cleanup in probe
net: stmmac: fix up RX flow hash indirection table when setting channels
sunrpc: only free unix grouplist after RCU settles
NFSD: callback request does not use correct credential for AUTH_SYS
ice: fix wrong fallback logic for FDIR
ice: Reset FDIR counter in FDIR init stage
raw: use net_hash_mix() in hash function
raw: Fix NULL deref in raw_get_next().
ping: Fix potentail NULL deref for /proc/net/icmp.
ethtool: reset #lanes when lanes is omitted
netlink: annotate lockless accesses to nlk->max_recvmsg_len
gve: Secure enough bytes in the first TX desc for all TCP pkts
arm64: compat: Work around uninitialized variable warning
net: stmmac: check fwnode for phy device before scanning for phy
cxl/pci: Fix CDAT retrieval on big endian
cxl/pci: Handle truncated CDAT header
cxl/pci: Handle truncated CDAT entries
cxl/pci: Handle excessive CDAT length
PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y
PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y
usb: xhci: tegra: fix sleep in atomic call
xhci: Free the command allocated for setting LPM if we return early
xhci: also avoid the XHCI_ZERO_64B_REGS quirk with a passthrough iommu
usb: cdnsp: Fixes error: uninitialized symbol 'len'
usb: dwc3: pci: add support for the Intel Meteor Lake-S
USB: serial: cp210x: add Silicon Labs IFS-USB-DATACABLE IDs
usb: typec: altmodes/displayport: Fix configure initial pin assignment
USB: serial: option: add Telit FE990 compositions
USB: serial: option: add Quectel RM500U-CN modem
drivers: iio: adc: ltc2497: fix LSB shift
iio: adis16480: select CONFIG_CRC32
iio: adc: qcom-spmi-adc5: Fix the channel name
iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chip
iio: dac: cio-dac: Fix max DAC write value check for 12-bit
iio: buffer: correctly return bytes written in output buffers
iio: buffer: make sure O_NONBLOCK is respected
iio: light: cm32181: Unregister second I2C client if present
tty: serial: sh-sci: Fix transmit end interrupt handler
tty: serial: sh-sci: Fix Rx on RZ/G2L SCI
tty: serial: fsl_lpuart: avoid checking for transfer complete when UARTCTRL_SBK is asserted in lpuart32_tx_empty
nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread()
nilfs2: fix sysfs interface lifetime
dt-bindings: serial: renesas,scif: Fix 4th IRQ for 4-IRQ SCIFs
serial: 8250: Prevent starting up DMA Rx on THRI interrupt
ksmbd: do not call kvmalloc() with __GFP_NORETRY | __GFP_NO_WARN
ksmbd: fix slab-out-of-bounds in init_smb2_rsp_hdr
ALSA: hda/realtek: Add quirk for Clevo X370SNW
ALSA: hda/realtek: fix mute/micmute LEDs for a HP ProBook
x86/acpi/boot: Correct acpi_is_processor_usable() check
x86/ACPI/boot: Use FADT version to check support for online capable
KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode
mm: kfence: fix PG_slab and memcg_data clearing
mm: kfence: fix handling discontiguous page
coresight: etm4x: Do not access TRCIDR1 for identification
coresight-etm4: Fix for() loop drvdata->nr_addr_cmp range bug
counter: 104-quad-8: Fix race condition between FLAG and CNTR reads
counter: 104-quad-8: Fix Synapse action reported for Index signals
blk-mq: directly poll requests
iio: adc: ad7791: fix IRQ flags
io_uring: fix return value when removing provided buffers
io_uring: fix memory leak when removing provided buffers
scsi: qla2xxx: Fix memory leak in qla2x00_probe_one()
scsi: iscsi_tcp: Check that sock is valid before iscsi_set_param()
nvme: fix discard support without oncs
cifs: sanitize paths in cifs_update_super_prepath.
block: ublk: make sure that block size is set correctly
block: don't set GD_NEED_PART_SCAN if scan partition failed
perf/core: Fix the same task check in perf_event_set_output
ftrace: Mark get_lock_parent_ip() __always_inline
ftrace: Fix issue that 'direct->addr' not restored in modify_ftrace_direct()
fs: drop peer group ids under namespace lock
can: j1939: j1939_tp_tx_dat_new(): fix out-of-bounds memory access
can: isotp: fix race between isotp_sendsmg() and isotp_release()
can: isotp: isotp_ops: fix poll() to not report false EPOLLOUT events
can: isotp: isotp_recvmsg(): use sock_recv_cmsgs() to get SOCK_RXQ_OVFL infos
ACPI: video: Add auto_detect arg to __acpi_video_get_backlight_type()
ACPI: video: Make acpi_backlight=video work independent from GPU driver
ACPI: video: Add acpi_backlight=video quirk for Apple iMac14,1 and iMac14,2
ACPI: video: Add acpi_backlight=video quirk for Lenovo ThinkPad W530
net: stmmac: Add queue reset into stmmac_xdp_open() function
tracing/synthetic: Fix races on freeing last_cmd
tracing/timerlat: Notify new max thread latency
tracing/osnoise: Fix notify new tracing_max_latency
tracing: Free error logs of tracing instances
ASoC: hdac_hdmi: use set_stream() instead of set_tdm_slots()
tracing/synthetic: Make lastcmd_mutex static
zsmalloc: document freeable stats
mm: vmalloc: avoid warn_alloc noise caused by fatal signal
wifi: mt76: ignore key disable commands
ublk: read any SQE values upfront
drm/panfrost: Fix the panfrost_mmu_map_fault_addr() error path
drm/nouveau/disp: Support more modes by checking with lower bpc
drm/i915: Fix context runtime accounting
drm/i915: fix race condition UAF in i915_perf_add_config_ioctl
ring-buffer: Fix race while reader and writer are on the same page
mm/swap: fix swap_info_struct race between swapoff and get_swap_pages()
mm/hugetlb: fix uffd wr-protection for CoW optimization path
maple_tree: fix get wrong data_end in mtree_lookup_walk()
maple_tree: fix a potential concurrency bug in RCU mode
blk-throttle: Fix that bps of child could exceed bps limited in parent
drm/amd/display: Clear MST topology if it fails to resume
drm/amdgpu: for S0ix, skip SDMA 5.x+ suspend/resume
drm/amdgpu: skip psp suspend for IMU enabled ASICs mode2 reset
drm/display/dp_mst: Handle old/new payload states in drm_dp_remove_payload()
drm/i915/dp_mst: Fix payload removal during output disabling
drm/bridge: lt9611: Fix PLL being unable to lock
drm/i915: Use _MMIO_PIPE() for SKL_BOTTOM_COLOR
drm/i915: Split icl_color_commit_noarm() from skl_color_commit_noarm()
mm: take a page reference when removing device exclusive entries
maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()
maple_tree: fix potential rcu issue
maple_tree: reduce user error potential
maple_tree: fix handle of invalidated state in mas_wr_store_setup()
maple_tree: fix mas_prev() and mas_find() state handling
maple_tree: be more cautious about dead nodes
maple_tree: refine ma_state init from mas_start()
maple_tree: detect dead nodes in mas_start()
maple_tree: fix freeing of nodes in rcu mode
maple_tree: remove extra smp_wmb() from mas_dead_leaves()
maple_tree: add smp_rmb() to dead node detection
maple_tree: add RCU lock checking to rcu callback functions
mm: enable maple tree RCU mode by default.
bpftool: Print newline before '}' for struct with padding only fields
Linux 6.1.24
Change-Id: I475408e1166927565c7788e7095bdf2cb236c4b2
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
[ Upstream commit eee87853794187f6adbe19533ed79c8b44b36a91 ]
In the case of CLONE_INTO_CGROUP, not all cpusets are ready to accept
new tasks. It is too late to check that in cpuset_fork(). So we need
to add the cpuset_can_fork() and cpuset_cancel_fork() methods to
pre-check it before we can allow attachment to a different cpuset.
We also need to set the attach_in_progress flag to alert other code
that a new task is going to be added to the cpuset.
Fixes: ef2c41cf38 ("clone3: allow spawning processes into cgroups")
Suggested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42a11bf5c5436e91b040aeb04063be1710bb9f9c ]
By default, the clone(2) syscall spawn a child process into the same
cgroup as its parent. With the use of the CLONE_INTO_CGROUP flag
introduced by commit ef2c41cf38 ("clone3: allow spawning processes
into cgroups"), the child will be spawned into a different cgroup which
is somewhat similar to writing the child's tid into "cgroup.threads".
The current cpuset_fork() method does not properly handle the
CLONE_INTO_CGROUP case where the cpuset of the child may be different
from that of its parent. Update the cpuset_fork() method to treat the
CLONE_INTO_CGROUP case similar to cpuset_attach().
Since the newly cloned task has not been running yet, its actual
memory usage isn't known. So it is not necessary to make change to mm
in cpuset_fork().
Fixes: ef2c41cf38 ("clone3: allow spawning processes into cgroups")
Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18f9a4d47527772515ad6cbdac796422566e6440 ]
Cpuset v2 has no spread flags to set. So we can skip spread
flags update if cpuset v2 is being used. Also change the name to
cpuset_update_task_spread_flags() to indicate that there are multiple
spread flags.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Stable-dep-of: 42a11bf5c543 ("cgroup/cpuset: Make cpuset_fork() handle CLONE_INTO_CGROUP properly")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 91dcf1e8068e9a8823e419a7a34ff4341275fb70 ]
When local group is fully busy but its average load is above system load,
computing the imbalance will overflow and local group is not the best
target for pulling this load.
Fixes: 0b0695f2b3 ("sched/fair: Rework load_balance()")
Reported-by: Tingjia Cao <tjcao980311@gmail.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tingjia Cao <tjcao980311@gmail.com>
Link: https://lore.kernel.org/lkml/CABcWv9_DAhVBOq2=W=2ypKE9dKM5s2DvoV8-U0+GDwwuKZ89jQ@mail.gmail.com/T/
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit ba9182a89626d5f83c2ee4594f55cb9c1e60f0e2 upstream.
After a successful cpuset_can_attach() call which increments the
attach_in_progress flag, either cpuset_cancel_attach() or cpuset_attach()
will be called later. In cpuset_attach(), tasks in cpuset_attach_wq,
if present, will be woken up at the end. That is not the case in
cpuset_cancel_attach(). So missed wakeup is possible if the attach
operation is somehow cancelled. Fix that by doing the wakeup in
cpuset_cancel_attach() as well.
Fixes: e44193d39e ("cpuset: let hotplug propagation work wait for task attaching")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 292fd843de26c551856e66faf134512c52dd78b4 upstream.
It was found that commit 7a2127e66a00 ("cpuset: Call
set_cpus_allowed_ptr() with appropriate mask for task") introduced a bug
that corrupted "cpuset.cpus" of a partition root when it was updated.
It is because the tmp->new_cpus field of the passed tmp parameter
of update_parent_subparts_cpumask() should not be used at all as
it contains important cpumask data that should not be overwritten.
Fix it by using tmp->addmask instead.
Also update update_cpumask() to make sure that trialcs->cpu_allowed
will not be corrupted until it is no longer needed.
Fixes: 7a2127e66a00 ("cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: stable@vger.kernel.org # v6.2+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fcdb1eda5302599045bb366e679cccb4216f3873 upstream.
We need to reset forceidle_sum to 0 when reading from root, since the
bstat we accumulate into is stack allocated.
To make this more robust, just replace the existing cputime reset with a
memset of the overall bstat.
Signed-off-by: Josh Don <joshdon@google.com>
Fixes: 1fcf54deb7 ("sched/core: add forced idle accounting for cgroups")
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d52727f8043cfda241ae96896628d92fa9c50bb ]
If a trace instance has a failure with its snapshot code, the error
message is to be written to that instance's buffer. But currently, the
message is written to the top level buffer. Worse yet, it may also disable
the top level buffer and not the instance that had the issue.
Link: https://lkml.kernel.org/r/20230405022341.688730321@goodmis.org
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <zwisler@google.com>
Fixes: 2824f50332 ("tracing: Make the snapshot trigger work with instances")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d503b8f7474fe7ac616518f7fc49773cbab49f36 ]
Add a generic trace_array_puts() that can be used to "trace_puts()" into
an allocated trace_array instance. This is just another variant of
trace_array_printk().
Link: https://lkml.kernel.org/r/20230207173026.584717290@goodmis.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ross Zwisler <zwisler@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Stable-dep-of: 9d52727f8043 ("tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Add vendor hook for module init, so we can get memory type and
use it to do memory type check for architecture
dependent page table setting. To make sure the architecture
dependent tables are created correctly, we need to know when
module parts are initialized and their attributes.
For releasing modules, corresponding tables and attributes should be
destroyed and restored.
These hooks may be invoked in non-atomic context, so it's
necessary to use restricted ones.
Bug: 248994334
Change-Id: Ie9f415c36bca1fb98e021522b627e562d27cdef4
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Add restricted vendor hook for creds, so we get the creds
information to monitor cred lifetime. During the lifetime,
we store the creds information in a standalone protected
memory and keep track of integrity.
These hooks may be invoked in non-atomic context, so it's
necessary to use restricted ones.
Bug: 248994334
Change-Id: I57fbb759452302fa1ba1e720c76bfe671eab96b5
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
commit 3dd4432549415f3c65dd52d5c687629efbf4ece1 upstream.
Use the maple tree in RCU mode for VMA tracking.
The maple tree tracks the stack and is able to update the pivot
(lower/upper boundary) in-place to allow the page fault handler to write
to the tree while holding just the mmap read lock. This is safe as the
writes to the stack have a guard VMA which ensures there will always be
a NULL in the direction of the growth and thus will only update a pivot.
It is possible, but not recommended, to have VMAs that grow up/down
without guard VMAs. syzbot has constructed a testcase which sets up a
VMA to grow and consume the empty space. Overwriting the entire NULL
entry causes the tree to be altered in a way that is not safe for
concurrent readers; the readers may see a node being rewritten or one
that does not match the maple state they are using.
Enabling RCU mode allows the concurrent readers to see a stable node and
will return the expected result.
Link: https://lkml.kernel.org/r/20230227173632.3292573-9-surenb@google.com
Cc: stable@vger.kernel.org
Fixes: d4af56c5c7 ("mm: start tracking VMAs with maple tree")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: syzbot+8d95422d3537159ca390@syzkaller.appspotmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6455b6163d8c680366663cdb8c679514d55fc30c upstream.
When user reads file 'trace_pipe', kernel keeps printing following logs
that warn at "cpu_buffer->reader_page->read > rb_page_size(reader)" in
rb_get_reader_page(). It just looks like there's an infinite loop in
tracing_read_pipe(). This problem occurs several times on arm64 platform
when testing v5.10 and below.
Call trace:
rb_get_reader_page+0x248/0x1300
rb_buffer_peek+0x34/0x160
ring_buffer_peek+0xbc/0x224
peek_next_entry+0x98/0xbc
__find_next_entry+0xc4/0x1c0
trace_find_next_entry_inc+0x30/0x94
tracing_read_pipe+0x198/0x304
vfs_read+0xb4/0x1e0
ksys_read+0x74/0x100
__arm64_sys_read+0x24/0x30
el0_svc_common.constprop.0+0x7c/0x1bc
do_el0_svc+0x2c/0x94
el0_svc+0x20/0x30
el0_sync_handler+0xb0/0xb4
el0_sync+0x160/0x180
Then I dump the vmcore and look into the problematic per_cpu ring_buffer,
I found that tail_page/commit_page/reader_page are on the same page while
reader_page->read is obviously abnormal:
tail_page == commit_page == reader_page == {
.write = 0x100d20,
.read = 0x8f9f4805, // Far greater than 0xd20, obviously abnormal!!!
.entries = 0x10004c,
.real_end = 0x0,
.page = {
.time_stamp = 0x857257416af0,
.commit = 0xd20, // This page hasn't been full filled.
// .data[0...0xd20] seems normal.
}
}
The root cause is most likely the race that reader and writer are on the
same page while reader saw an event that not fully committed by writer.
To fix this, add memory barriers to make sure the reader can see the
content of what is committed. Since commit a0fcaaed0c ("ring-buffer: Fix
race between reset page and reading page") has added the read barrier in
rb_get_reader_page(), here we just need to add the write barrier.
Link: https://lore.kernel.org/linux-trace-kernel/20230325021247.2923907-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes: 77ae365eca ("ring-buffer: make lockless")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31c683967174b487939efaf65e41f5ff1404e141 upstream.
The lastcmd_mutex is only used in trace_events_synth.c and should be
static.
Link: https://lore.kernel.org/linux-trace-kernel/202304062033.cRStgOuP-lkp@intel.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230406111033.6e26de93@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
Fixes: 4ccf11c4e8a8e ("tracing/synthetic: Fix races on freeing last_cmd")
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3cba7f02cd82118c32651c73374d8a5a459d9a6 upstream.
osnoise/timerlat tracers are reporting new max latency on instances
where the tracing is off, creating inconsistencies between the max
reported values in the trace and in the tracing_max_latency. Thus
only report new tracing_max_latency on active tracing instances.
Link: https://lkml.kernel.org/r/ecd109fde4a0c24ab0f00ba1e9a144ac19a91322.1680104184.git.bristot@kernel.org
Cc: stable@vger.kernel.org
Fixes: dae181349f ("tracing/osnoise: Support a list of trace_array *tr")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9f451a9029a16eb7913ace09b92493d00f2e564 upstream.
timerlat is not reporting a new tracing_max_latency for the thread
latency. The reason is that it is not calling notify_new_max_latency()
function after the new thread latency is sampled.
Call notify_new_max_latency() after computing the thread latency.
Link: https://lkml.kernel.org/r/16e18d61d69073d0192ace07bf61e405cca96e9c.1680104184.git.bristot@kernel.org
Cc: stable@vger.kernel.org
Fixes: dae181349f ("tracing/osnoise: Support a list of trace_array *tr")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>