linux-kernelorg-stable

Luis Gerhorst d6f1c85f22 bpf: Fall back to nospec for Spectre v1

This implements the core of the series and causes the verifier to fall back to mitigating Spectre v1 using speculation barriers. The approach was presented at LPC'24 [1] and RAID'24 [2].

If we find any forbidden behavior on a speculative path, we insert a nospec (e.g., lfence speculation barrier on x86) before the instruction and stop verifying the path. While verifying a speculative path, we can furthermore stop verification of that path whenever we encounter a nospec instruction.

A minimal example program would look as follows:

        A = true
        B = true
        if A goto e
        f()
        if B goto e
        unsafe()
    e:  exit

There are the following speculative and non-speculative paths (`cur->speculative` and `speculative` referring to the value of the push_stack() parameters):

- A = true
- B = true
- if A goto e
  - A && !cur->speculative && !speculative
    - exit
  - !A && !cur->speculative && speculative
    - f()
    - if B goto e
      - B && cur->speculative && !speculative
        - exit
      - !B && cur->speculative && speculative
        - unsafe()

If f() contains any unsafe behavior under Spectre v1 and the unsafe behavior matches `state->speculative && error_recoverable_with_nospec(err)`, do_check() will now add a nospec before f() instead of rejecting the program:

        A = true
        B = true
        if A goto e
        nospec
        f()
        if B goto e
        unsafe()
    e:  exit

Alternatively, the algorithm also takes advantage of nospec instructions inserted for other reasons (e.g., Spectre v4). Taking the program above as an example, speculative path exploration can stop before f() if a nospec was inserted there because of Spectre v4 sanitization.

In this example, all instructions after the nospec are dead code (and with the nospec they are also dead code speculatively).

For this, it relies on the fact that speculation barriers generally prevent all later instructions from executing if the speculation was not correct:

* On Intel x86_64, lfence acts as a full speculation barrier, not only as a load fence [3]:

      An LFENCE instruction or a serializing instruction will ensure that no later instructions execute, even speculatively, until all prior instructions complete locally. [...] Inserting an LFENCE instruction after a bounds check prevents later operations from executing before the bound check completes.

  This was experimentally confirmed in [4].

* On AMD x86_64, lfence is dispatch-serializing [5] (requires MSR C001_1029[1] to be set if the MSR is supported; this happens in init_amd()). AMD further specifies "A dispatch serializing instruction forces the processor to retire the serializing instruction and all previous instructions before the next instruction is executed" [8]. As dispatch is not specific to memory loads or branches, lfence therefore also affects all instructions there. Also, if retiring a branch means its PC change becomes architectural (should be), this means any "wrong" speculation is aborted as required for this series.

* ARM's SB speculation barrier instruction also affects "any instruction that appears later in the program order than the barrier" [6].

* PowerPC's barrier also affects all subsequent instructions [7]:

      [...] executing an ori R31,R31,0 instruction ensures that all instructions preceding the ori R31,R31,0 instruction have completed before the ori R31,R31,0 instruction completes, and that no subsequent instructions are initiated, even out-of-order, until after the ori R31,R31,0 instruction completes. The ori R31,R31,0 instruction may complete before storage accesses associated with instructions preceding the ori R31,R31,0 instruction have been performed

Regarding the example, this implies that `if B goto e` will not execute before `if A goto e` completes.
Once `if A goto e` completes, the CPU should find that the speculation was wrong and continue with `exit`.

If there is any other path that leads to `if B goto e` (and therefore `unsafe()`) without going through `if A goto e`, then a nospec will still be needed there. However, this patch assumes this other path will be explored separately and therefore be discovered by the verifier even if the exploration discussed here stops at the nospec.

This patch furthermore has the unfortunate consequence that Spectre v1 mitigations now only support architectures which implement BPF_NOSPEC. Before this commit, Spectre v1 mitigations prevented exploits by rejecting the programs on all architectures. Because some JITs do not implement BPF_NOSPEC, this patch therefore may regress unpriv BPF's security to a limited extent:

* The regression is limited to systems that are vulnerable to Spectre v1, have unprivileged BPF enabled, and do NOT emit insns for BPF_NOSPEC. The latter is not the case for x86 64- and 32-bit, arm64, and powerpc 64-bit and they are therefore not affected by the regression. According to commit `a6f6a95f25` ("LoongArch, bpf: Fix jit to skip speculation barrier opcode"), LoongArch is not vulnerable to Spectre v1 and therefore also not affected by the regression.

* To the best of my knowledge this regression may therefore only affect MIPS. This is deemed acceptable because unpriv BPF is still disabled there by default. As stated in a previous commit, BPF_NOSPEC could be implemented for MIPS based on GCC's speculation_barrier implementation.

* It is unclear which other architectures (besides x86 64- and 32-bit, ARM64, PowerPC 64-bit, LoongArch, and MIPS) supported by the kernel are vulnerable to Spectre v1. Also, it is not clear if barriers are available on these architectures. Implementing BPF_NOSPEC on these architectures therefore is non-trivial. Searching GCC and the kernel for speculation barrier implementations for these architectures yielded no result.

* If any of those regressed systems is also vulnerable to Spectre v4, the system was already vulnerable to Spectre v4 attacks based on unpriv BPF before this patch and the impact is therefore further limited.

As an alternative to regressing security, one could still reject programs if the architecture does not emit BPF_NOSPEC (e.g., by removing the empty BPF_NOSPEC-case from all JITs except for LoongArch where it appears justified). However, this will cause rejections on these archs that are likely unfounded in the vast majority of cases.

In the tests, some are now successful where we previously had a false-positive (i.e., rejection). Change them to reflect where the nospec should be inserted (using __xlated_unpriv) and modify the error message if the nospec is able to mitigate a problem that previously shadowed another problem (in that case __xlated_unpriv does not work, therefore just add a comment).

Define SPEC_V1 once here to avoid duplicating this ifdef whenever we check for nospec insns using __xlated_unpriv. This also improves readability. PowerPC can probably also be added here. However, omit it for now because the BPF CI currently does not include a test.

Limit it to EPERM, EACCES, and EINVAL (and not everything except for EFAULT and ENOMEM) as it already has the desired effect for most real-world programs. I briefly went through all the occurrences of EPERM, EINVAL, and EACCES in verifier.c to validate that catching them like this makes sense.

Thanks to Dustin for their help in checking the vendor documentation.

[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
[3] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/runtime-speculative-side-channel-mitigations.html ("Managed Runtime Speculative Execution Side Channel Mitigations")
[4] https://dl.acm.org/doi/pdf/10.1145/3359789.3359837 ("Speculator: a tool to analyze speculative execution attacks and mitigations" - Section 4.6 "Stopping Speculative Execution")
[5] https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/software-techniques-for-managing-speculation.pdf ("White Paper - SOFTWARE TECHNIQUES FOR MANAGING SPECULATION ON AMD PROCESSORS - REVISION 5.09.23")
[6] https://developer.arm.com/documentation/ddi0597/2020-12/Base-Instructions/SB--Speculation-Barrier- ("SB - Speculation Barrier - Arm Armv8-A A32/T32 Instruction Set Architecture (2020-12)")
[7] https://wiki.raptorcs.com/w/images/5/5f/OPF_PowerISA_v3.1C.pdf ("Power ISA™ - Version 3.1C - May 26, 2024 - Section 9.2.1 of Book III")
[8] https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf ("AMD64 Architecture Programmer’s Manual Volumes 1–5 - Revision 4.08 - April 2024 - 7.6.4 Serializing Instructions")

Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Henriette Herzog <henriette.herzog@rub.de>
Cc: Dustin Nguyen <nguyen@cs.fau.de>
Cc: Maximilian Ott <ott@cs.fau.de>
Cc: Milan Stephan <milan.stephan@fau.de>
Link: https://lore.kernel.org/r/20250603212428.338473-1-luis.gerhorst@fau.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

2025-06-09 20:11:10 -07:00
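
The fallback described in the commit message above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the verifier's code: `Insn`, `verify`, `example_program`, and `nospec_positions` are hypothetical names, every branch condition is treated as known-true (as in the message's example), and the body of `error_recoverable_with_nospec` is inferred from the sentence that limits the fallback to EPERM, EACCES, and EINVAL. The real do_check() tracks far more state and prunes equivalent states, which this model omits.

```python
import errno
from dataclasses import dataclass


@dataclass
class Insn:
    """One instruction of the toy program (hypothetical model, not a BPF insn)."""
    op: str               # "set", "jmp_if", "call", or "exit"
    target: int = -1      # jump target index for "jmp_if"
    err: int = 0          # negative errno the toy check reports, 0 if safe
    nospec: bool = False  # True once a barrier is placed before this insn


def error_recoverable_with_nospec(err: int) -> bool:
    # Assumption based on the message: only EPERM, EACCES, and EINVAL qualify.
    return err in (-errno.EPERM, -errno.EACCES, -errno.EINVAL)


def verify(prog: list[Insn]) -> int:
    """Explore all paths; return 0 on success or a negative errno on rejection."""
    stack = [(0, False)]  # (pc, speculative), loosely modelling push_stack()
    while stack:
        pc, speculative = stack.pop()
        while True:
            insn = prog[pc]
            if speculative and insn.nospec:
                break  # an existing barrier ends the speculative path
            if insn.err:
                if speculative and error_recoverable_with_nospec(insn.err):
                    insn.nospec = True  # mitigate instead of rejecting
                    break               # and stop verifying this path
                return insn.err         # reject the program
            if insn.op == "exit":
                break
            if insn.op == "jmp_if":
                # Every condition in this toy is known to be true, so the
                # fall-through is reachable only through misprediction.
                stack.append((pc + 1, True))
                pc = insn.target
                continue
            pc += 1
    return 0


def example_program(f_err: int = 0, unsafe_err: int = -errno.EACCES) -> list[Insn]:
    """The example from the commit message, with configurable toy errors."""
    return [
        Insn("set"),                   # 0: A = true
        Insn("set"),                   # 1: B = true
        Insn("jmp_if", target=6),      # 2: if A goto e
        Insn("call", err=f_err),       # 3: f()
        Insn("jmp_if", target=6),      # 4: if B goto e
        Insn("call", err=unsafe_err),  # 5: unsafe()
        Insn("exit"),                  # 6: e: exit
    ]


def nospec_positions(prog: list[Insn]) -> list[int]:
    """Indices of instructions that received a barrier in front of them."""
    return [i for i, insn in enumerate(prog) if insn.nospec]
```

With the defaults, only `unsafe()` fails its check, so `verify(example_program())` accepts the program and places a barrier before index 5. Passing `f_err=-errno.EACCES` instead places the barrier before `f()` (index 3) and never explores `unsafe()` speculatively, which corresponds to the second listing in the message. An error on a non-speculative path, or one outside the recoverable set, still rejects the program.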
..
acct	selftests: acct: Add ksft_exit_skip if not running as root	2025-01-14 17:06:31 -07:00
alsa	selftests/alsa: Fix circular dependency involving global-timer	2024-12-20 10:00:41 +01:00
amd-pstate	…
arm64	kselftest/arm64: Set default OUTPUT path when undefined	2025-05-16 15:15:13 +01:00
bpf	bpf: Fall back to nospec for Spectre v1	2025-06-09 20:11:10 -07:00
breakpoints	…
cachestat	…
capabilities	…
cgroup	Generic:	2025-06-02 12:24:58 -07:00
clone3	selftests/pidfd: fixes syscall number defines	2025-03-25 14:59:05 +01:00
connector	…
core	…
coredump	selftests/coredump: add tests for AF_UNIX coredumps	2025-05-21 13:59:12 +02:00
cpu-hotplug	…
cpufreq	kselftest: cpufreq: Get rid of double suspend in rtcwake case	2025-05-09 12:43:39 -06:00
damon	selftests/damon/_damon_sysfs: skip testcases if CONFIG_DAMON_SYSFS is disabled	2025-05-31 22:46:15 -07:00
devices	…
dma	…
dmabuf-heaps	…
drivers	net: devmem: ncdevmem: remove unused variable	2025-05-27 19:19:36 -07:00
dt	…
efivarfs	selftests/efivarfs: add concurrent update tests	2025-01-21 16:34:41 +01:00
exec	AT_EXECVE_CHECK update for v6.14-rc1 (fix1)	2025-01-31 17:12:31 -08:00
fchmodat2	…
filelock	…
filesystems	- The 3 patch series "hung_task: extend blocking task stacktrace dump to	2025-05-31 19:12:53 -07:00
firmware	…
fpu	…
ftrace	selftests/ftrace: Convert poll to a gen_file	2025-05-09 12:43:10 -06:00
futex	selftests/futex: Fix spelling mistake "unitiliazed" -> "uninitialized"	2025-05-21 13:57:41 +02:00
gpio	selftests: gpio: gpio-aggregator: add a test case for _sysfs prefix reservation	2025-04-14 22:30:01 +02:00
hid	lib/crc: remove unnecessary prompt for CONFIG_CRC_T10DIF	2025-04-04 11:31:42 -07:00
ia64	…
intel_pstate	…
iommu	iommufd: Test attach before detaching pasid	2025-03-28 11:40:41 -03:00
ipc	selftests/ipc: Remove unused variables	2025-01-14 17:06:31 -07:00
ir	…
kcmp	…
kexec	selftests/kexec: Add x86_64 selftest for kexec-jump and exception handling	2025-04-10 12:17:14 +02:00
kmod	lib/test_kmod: do not hardcode/depend on any filesystem	2025-05-11 17:54:09 -07:00
kselftest	printf: convert self-test to KUnit	2025-03-13 10:26:33 -07:00
kselftest_harness	selftests: harness: Add kselftest harness selftest	2025-05-21 15:32:27 +02:00
kvm	KVM SVM changes for 6.16:	2025-05-27 12:15:49 -04:00
landlock	selftests/landlock: Add PID tests for audit records	2025-04-11 12:53:22 +02:00
lib	lib/prime_numbers: KUnit test should not select PRIME_NUMBERS	2025-04-15 13:50:43 -07:00
livepatch	Livepatching changes for 6.15	2025-03-27 19:26:10 -07:00
lkdtm	…
locking	…
lsm	selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test	2024-12-18 18:14:29 -05:00
media_tests	selftest: media_tests: fix trivial UAF typo	2025-01-14 17:06:31 -07:00
membarrier	…
memfd	selftests/memfd/memfd_test: fix possible NULL pointer dereference	2025-01-25 20:22:44 -08:00
memory-hotplug	…
mincore	31 hotfixes. 9 are cc:stable and the remainder address post-6.15 issues	2025-04-16 20:07:32 -07:00
mm	- The 2 patch series "zram: support algorithm-specific parameters" from	2025-06-02 16:00:26 -07:00
module	…
mount	…
mount_setattr	selftests/mount_settattr: remove duplicate syscall definitions	2025-05-12 11:40:12 +02:00
move_mount_set_group	…
mqueue	…
mseal_system_mappings	selftest: test system mappings are sealed	2025-04-01 15:17:16 -07:00
nci	selftests: nci: Fix "Electrnoics" to "Electronics"	2025-05-20 18:13:43 -07:00
net	selftests: netfilter: Fix skip of wildcard interface test	2025-05-28 09:48:41 +02:00
nolibc	selftests/nolibc: drop include guards around standard headers	2025-05-21 15:32:27 +02:00
ntb	…
openat2	…
pci_endpoint	misc: pci_endpoint_test: Add support for PCITEST_IRQ_TYPE_AUTO	2025-03-26 06:11:54 +00:00
pcie_bwctrl	selftests/pcie_bwctrl: Fix test progs list	2025-04-18 08:23:22 -05:00
perf_events	selftests/perf_events: Fix spelling mistake "sycnhronize" -> "synchronize"	2025-04-29 13:35:55 -06:00
pid_namespace	selftests: pid_namespace: add missing sys/mount.h include in pid_max.c	2025-05-09 13:12:33 -06:00
pidfd	vfs-6.16-rc1.selftests	2025-05-26 11:32:28 -07:00
power_supply	…
powerpc	powerpc updates for 6.15	2025-03-27 19:39:08 -07:00
prctl	…
proc	…
pstore	…
ptp	testptp: Add option to open PHC in readonly mode	2025-03-05 12:43:54 +00:00
ptrace	selftests/ptrace: add a test case for PTRACE_SET_SYSCALL_INFO	2025-05-11 17:48:16 -07:00
rcutorture	rcutorture: Fix issue with re-using old images on ARM64	2025-05-16 11:15:34 -04:00
resctrl	selftests/resctrl: Discover SNC kernel support and adjust messages	2025-01-14 17:06:32 -07:00
ring-buffer	selftests/ring-buffer: Add test for out-of-bound pgoff mapping	2025-01-14 17:06:32 -07:00
riscv	selftests: riscv: fix v_exec_initval_nolibc.c	2025-04-01 07:03:04 +00:00
rlimits	…
rseq	rseq/selftests: Fix namespace collision with rseq UAPI header	2025-03-19 21:26:24 +01:00
rtc	rtc: remove 'setdate' test program	2025-04-01 15:25:15 +02:00
rust	…
safesetid	…
sched	sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files	2025-03-19 22:23:24 +01:00
sched_ext	selftests/sched_ext: Update test enq_select_cpu_fails	2025-05-21 07:35:58 -10:00
seccomp	selftests: seccomp: Fix "performace" to "performance"	2025-05-20 13:16:39 -07:00
sgx	…
signal	…
size	…
sparc64	…
splice	…
static_keys	…
sync	…
syscall_user_dispatch	…
sysctl	sysctl: Add 0012 to test the u8 range check	2025-04-14 14:13:41 +02:00
tc-testing	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2025-05-28 10:11:15 +02:00
tdx	…
thermal/intel	selftests: fix some typos in tools/testing/selftests	2025-05-11 17:54:13 -07:00
timens	selftests/timens: timerfd: Use correct clockid type in tclock_gettime()	2025-05-09 13:12:57 -06:00
timers	selftests/timers: Improve skew_consistency by testing with other clockids	2025-03-21 19:16:18 +01:00
tmpfs	selftests: tmpfs: Add kselftest support to tmpfs	2025-01-14 17:06:32 -07:00
tpm2	selftests: tpm2: test_smoke: use POSIX-conformant expression operator	2025-04-08 14:56:13 -06:00
tty	…
turbostat	…
ublk	selftests: ublk: add test for UBLK_F_QUIESCE	2025-05-23 09:42:12 -06:00
uevent	…
user_events	selftests/user_events: Fix failures caused by test code	2025-02-24 16:37:17 -07:00
vDSO	Updates for the VDSO infrastructure:	2025-03-25 11:30:42 -07:00
watchdog	…
wireguard	wireguard: selftests: specify -std=gnu17 for bash	2025-05-27 09:06:19 +02:00
x86	Merge commit 'its-for-linus-20250509-merge' into x86/core, to resolve conflicts	2025-05-13 10:47:10 +02:00
zram	selftests/zram: gitignore output file	2025-01-14 17:06:31 -07:00
.gitignore	selftests: tpm2: create a dedicated .gitignore	2025-04-08 14:56:13 -06:00
Makefile	Networking changes for 6.16.	2025-05-28 15:24:36 -07:00
gen_kselftest_tar.sh	…
kselftest.h	Revert "selftests: kselftest: Fix build failure with NOLIBC"	2025-02-26 22:13:48 +01:00
kselftest_deps.sh	…
kselftest_harness.h	selftests: harness: Stop using setjmp()/longjmp()	2025-05-21 15:32:37 +02:00
kselftest_install.sh	…
kselftest_module.h	…
lib.mk	selftests: Add headers target	2025-03-03 20:00:12 +01:00
run_kselftest.sh	selftests/run_kselftest.sh: Use readlink if realpath is not available	2025-05-15 16:52:47 -06:00