android_kernel_msm-6.1_noth.../arch
Thomas Gleixner edadebb349 x86/smp: Make stop_other_cpus() more robust
commit 1f5e7eb7868e42227ac426c96d437117e6e06e8e upstream.

Tony reported intermittent lockups on poweroff. His analysis identified the
wbinvd() in stop_this_cpu() as the culprit. This was added to ensure that
on SME enabled machines a kexec() does not leave any stale data in the
caches when switching from encrypted to non-encrypted mode or vice versa.

That wbinvd() is conditional on the SME feature bit which is read directly
from CPUID. But that readout does not check whether the CPUID leaf is
available or not. If it's not available the CPU will return the value of
the highest supported leaf instead. Depending on the content the "SME" bit
might be set or not.
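
A minimal sketch of a guarded readout, assuming the SME bit is bit 0 of EAX in leaf 0x8000001f and that leaf 0x80000000 reports the highest supported extended leaf in EAX. The callback type and the two fake CPUs are hypothetical and exist only so the check can be exercised without real hardware. This is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_LEAF_MAX 0x80000000u   /* EAX = highest supported extended leaf */
#define SME_LEAF     0x8000001fu   /* memory encryption capabilities leaf */
#define SME_BIT      (1u << 0)     /* assumed position of the SME bit in EAX */

/* Hypothetical callback: returns EAX for the requested CPUID leaf. */
typedef uint32_t (*cpuid_eax_fn)(uint32_t leaf);

/* Read the SME bit only if the leaf that carries it is actually available. */
static bool sme_supported(cpuid_eax_fn cpuid_eax)
{
    assert(cpuid_eax != NULL);

    if (cpuid_eax(EXT_LEAF_MAX) < SME_LEAF)
        return false;

    return (cpuid_eax(SME_LEAF) & SME_BIT) != 0;
}

/*
 * Hypothetical CPU whose highest extended leaf is below SME_LEAF. An
 * out-of-range request returns unrelated data which happens to have
 * bit 0 set.
 */
static uint32_t fake_cpu_without_leaf(uint32_t leaf)
{
    if (leaf == EXT_LEAF_MAX)
        return 0x80000008u;
    return 0x00000001u;
}

/* Hypothetical CPU which implements SME_LEAF and reports SME. */
static uint32_t fake_cpu_with_sme(uint32_t leaf)
{
    if (leaf == EXT_LEAF_MAX)
        return SME_LEAF;
    if (leaf == SME_LEAF)
        return SME_BIT;
    return 0;
}
```

With the unguarded readout, fake_cpu_without_leaf would appear to support SME. The guarded version rejects it because the leaf is out of range.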

That's incorrect but harmless. Making the CPUID readout conditional makes
the observed hangs go away, but it does not fix the underlying problem:

CPU0                                    CPU1

 stop_other_cpus()
   send_IPIs(REBOOT);                   stop_this_cpu()
   while (num_online_cpus() > 1);         set_online(false);
   proceed... -> hang
                                          wbinvd()

WBINVD is an expensive operation and if multiple CPUs issue it at the same
time the resulting delays are even larger.

But CPU0 already observed num_online_cpus() going down to 1 and proceeds
which causes the system to hang.

This issue exists independent of WBINVD, but the delays caused by WBINVD
make it more prominent.

Make this more robust by adding a cpumask which is initialized to the
online CPU mask before sending the IPIs and CPUs clear their bit in
stop_this_cpu() after the WBINVD completed. Check for that cpumask to
become empty in stop_other_cpus() instead of watching num_online_cpus().

The cpumask cannot plug all holes either, but it's better than a raw
counter and allows restricting the NMI fallback IPI to only the CPUs
which have not reported within the timeout window.
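
The handshake described above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel implementation: a 32-bit atomic word stands in for the cpumask, the stop_mask_* helper names are hypothetical, the initiating CPU is assumed to remove its own bit from the mask, and a poll counter replaces the real timeout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8u

/* Hypothetical stand-in for the cpumask: one bit per CPU still expected. */
static _Atomic uint32_t cpus_stop_mask;

/* Initiator: snapshot the online mask, minus itself, before sending IPIs. */
static void stop_mask_init(uint32_t online_mask, unsigned int this_cpu)
{
    assert(this_cpu < NR_CPUS);
    atomic_store(&cpus_stop_mask, online_mask & ~(1u << this_cpu));
}

/* Target CPU: clear its own bit, only after the cache flush has completed. */
static void stop_mask_report(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    atomic_fetch_and(&cpus_stop_mask, ~(1u << cpu));
}

/* CPUs which have not reported yet: the targets for the NMI fallback. */
static uint32_t stop_mask_pending(void)
{
    return atomic_load(&cpus_stop_mask);
}

/* Initiator: poll with a bounded budget; true once the mask is empty. */
static bool stop_mask_wait(unsigned long max_polls)
{
    while (max_polls--) {
        if (stop_mask_pending() == 0)
            return true;
    }
    return stop_mask_pending() == 0;
}
```

Unlike a wait on num_online_cpus(), the wait here cannot finish before each target has passed the point where it clears its bit. After a timeout the remaining bits identify exactly which CPUs still need the fallback.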

Fixes: 08f253ec37 ("x86/cpu: Clear SME feature flag when not in use")
Reported-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/3817d810-e0f1-8ef8-0bbd-663b919ca49b@cybernetics.com
Link: https://lore.kernel.org/r/87h6r770bv.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-01 13:16:23 +02:00
..
alpha alpha: fix R_ALPHA_LITERAL reloc for large modules 2023-03-17 08:50:31 +01:00
arc ARC: mm: fix leakage of memory allocated for PTE 2022-10-17 16:32:12 -07:00
arm ARM: dts: Fix erroneous ADS touchscreen polarities 2023-06-28 11:12:39 +02:00
arm64 KVM: arm64: Restore GICv2-on-GICv3 functionality 2023-06-28 11:12:40 +02:00
csky - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
hexagon - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
ia64 ia64: fix an addr to taddr in huge_pte_offset() 2023-05-11 23:03:40 +09:00
loongarch LoongArch: Fix perf event id calculation 2023-06-21 16:00:54 +02:00
m68k m68k: Move signal frame following exception on 68020/030 2023-05-30 14:03:18 +01:00
microblaze kbuild: fix "cat: .version: No such file or directory" 2022-11-24 09:26:02 +09:00
mips MIPS: Prefer cc-option for additions to cflags 2023-06-21 16:01:03 +02:00
nios2 nios2: dts: Fix tse_mac "max-frame-size" property 2023-06-21 16:00:54 +02:00
openrisc openrisc: Properly store r31 to pt_regs on unhandled exceptions 2023-05-11 23:03:35 +09:00
parisc parisc: Delete redundant register definitions in <asm/assembly.h> 2023-06-21 16:01:02 +02:00
powerpc powerpc/purgatory: remove PGO flags 2023-06-21 16:00:55 +02:00
riscv riscv/purgatory: remove PGO flags 2023-06-21 16:00:55 +02:00
s390 s390/purgatory: disable branch profiling 2023-06-28 11:12:38 +02:00
sh sh: nmi_debug: fix return value of __setup handler 2023-05-17 11:53:45 +02:00
sparc sparc: allow PM configs for sparc32 COMPILE_TEST 2023-03-10 09:33:27 +01:00
um um: harddog: fix modular build 2023-06-09 10:34:10 +02:00
x86 x86/smp: Make stop_other_cpus() more robust 2023-07-01 13:16:23 +02:00
xtensa xtensa: add __bswap{si,di}2 helpers 2023-05-30 14:03:18 +01:00
.gitignore
Kconfig ftrace: Allow WITH_ARGS flavour of graph tracer with shadow call stack 2022-12-31 13:32:45 +01:00