Centos-kernel-stream-9/kernel
Rafael Aquini a3f0ff645d x86/shstk: Introduce map_shadow_stack syscall
JIRA: https://issues.redhat.com/browse/RHEL-27743
Conflicts:
  * arch/x86/entry/syscalls/syscall_64.tbl: context differences due to
      merge conflict resolution done upstream (commit df57721f9a63) and
      the RHEL backport commits ec732a6c2d ("futex: Add sys_futex_wake()"),
      04a061fd44 ("futex: Add sys_futex_wait()"), and 99861cf1f6 ("futex:
      Add sys_futex_requeue()");
  * arch/x86/include/uapi/asm/mman.h: minor context difference due to RHEL
      backport commit f9f509b1e5 ("x86: Remove the arch_calc_vm_prot_bits()
      macro from the UAPI");

This patch is a backport of the following upstream commit:
commit c35559f94ebc3e3bc82e56e07161bb5986cd9761
Author: Rick Edgecombe <rick.p.edgecombe@intel.com>
Date:   Mon Jun 12 17:11:00 2023 -0700

    x86/shstk: Introduce map_shadow_stack syscall

    When operating with shadow stacks enabled, the kernel will automatically
    allocate shadow stacks for new threads. However, in some cases userspace
    will need additional shadow stacks. The main example of this is the
    ucontext family of functions, which require userspace to allocate and
    pivot to userspace-managed stacks.

    Unlike most other user memory permissions, shadow stacks need to be
    provisioned with special data in order to be useful. They need to be set
    up with a restore token so that userspace can pivot to them via the
    RSTORSSP instruction. However, the security design of shadow stacks is
    that they should not be written to except in limited circumstances. This
    presents a problem: how can userspace provision this special data
    without making the shadow stack generally writable?

    Previously, a new PROT_SHADOW_STACK was attempted, which could be
    mprotect()ed from RW permissions after the data was provisioned. This was
    found to not be secure enough, as other threads could write to the
    shadow stack during the writable window.

    The kernel can use a special instruction, WRUSS, to write directly to
    userspace shadow stacks. So the solution is to map the memory with
    shadow stack permissions from the beginning (never generally writable
    in userspace), and have the kernel itself write the restore token.

    First, a new madvise() flag was explored, which could operate on the
    PROT_SHADOW_STACK memory. This had a few downsides:
    1. Extra checks were needed in mprotect() to prevent writable memory from
       ever becoming PROT_SHADOW_STACK.
    2. Extra checks/vma state were needed in the new madvise() to prevent
       restore tokens being written into the middle of pre-used shadow stacks.
       It is ideal to prevent restore tokens being added at arbitrary
       locations, so the check was to make sure the shadow stack had never been
       written to.
    3. It stood out from the rest of the madvise flags, as more of a direct
       action than a hint at future desired behavior.

    So rather than repurpose two existing syscalls (mmap, madvise) that don't
    quite fit, just implement a new map_shadow_stack syscall to allow
    userspace to map and set up new shadow stacks in one step. While ucontext
    is the primary motivator, userspace may have other unforeseen reasons to
    set up its own shadow stacks using the WRSS instruction. Toward this,
    provide a flag so that stacks can optionally be set up securely for the
    common case of ucontext without enabling WRSS, or so that the kernel can
    potentially set up the shadow stack in some new way.

    The following example demonstrates how to create a new shadow stack with
    map_shadow_stack:
    void *shstk = map_shadow_stack(addr, stack_size, SHADOW_STACK_SET_TOKEN);

    Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
    Tested-by: Pengfei Xu <pengfei.xu@intel.com>
    Tested-by: John Allen <john.allen@amd.com>
    Tested-by: Kees Cook <keescook@chromium.org>
    Link: https://lore.kernel.org/all/20230613001108.3040476-35-rick.p.edgecombe%40intel.com

Signed-off-by: Rafael Aquini <raquini@redhat.com>
2024-10-01 11:17:15 -04:00
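
The one-line example in the commit message calls map_shadow_stack() as if it were a library function. Where libc provides no wrapper, the call can be made through syscall(2). The following is a minimal sketch, not part of the patch. It assumes the x86-64 syscall number 453 and the flag value (1ULL << 0) from the upstream UAPI headers; check both against the headers of the tree actually in use. The wrapper name and its mmap-style error convention are illustrative choices.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 453 is the upstream x86-64 number for map_shadow_stack. */
#ifndef __NR_map_shadow_stack
#define __NR_map_shadow_stack 453
#endif

/* Assumption: flag value as defined in the upstream x86 UAPI mman.h. */
#ifndef SHADOW_STACK_SET_TOKEN
#define SHADOW_STACK_SET_TOKEN (1ULL << 0)
#endif

/*
 * Illustrative wrapper: asks the kernel to map a shadow stack of `size`
 * bytes near `addr` (0 lets the kernel choose). With
 * SHADOW_STACK_SET_TOKEN the kernel also writes the restore token, so
 * the mapping is never generally writable from userspace.
 *
 * Returns the mapping address, or MAP_FAILED with errno set, mirroring
 * the mmap() convention.
 */
static void *map_shadow_stack(unsigned long addr, unsigned long size,
                              unsigned int flags)
{
    long ret = syscall(__NR_map_shadow_stack, addr, size, flags);

    /* syscall() returns -1 on error, which is the same bit pattern
     * as MAP_FAILED once cast to a pointer. */
    return (void *)ret;
}
```

A caller would check the result against MAP_FAILED before using it. On kernels without the syscall, errno is expected to be ENOSYS, and on hardware or configurations without user shadow stack support the call is expected to fail as well, so callers should treat failure as a normal outcome and fall back accordingly. A successful mapping can be released with munmap().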
..
bpf bpf: Add sockptr support for setsockopt 2024-07-02 09:55:34 -04:00
cgroup Merge: mm: update core code to v6.5 upstream 2024-09-26 17:54:14 +00:00
configs mm/slab: rename CONFIG_SLAB to CONFIG_SLAB_DEPRECATED 2024-09-05 20:35:59 -04:00
debug module: Move kdb module related code out of main kdb code 2024-06-17 14:17:13 -04:00
dma dma: fix DMA sync for drivers not calling dma_set_mask*() 2024-06-25 15:16:44 -07:00
entry
events Merge: mm: update core code to v6.5 upstream 2024-09-26 17:54:14 +00:00
futex Revert "Revert "Merge: cgroup: Backport upstream cgroup commits up to v6.8"" 2024-05-18 21:38:20 -04:00
gcov
irq Merge: Update ACPI to match v6.10 2024-09-18 14:45:36 +02:00
kcsan printk: export console trace point for kcsan/kasan/kfence/kmsan 2024-05-09 11:26:20 -04:00
livepatch kallsyms: Delete an unused parameter related to {module_}kallsyms_on_each_symbol() 2024-06-17 14:17:23 -04:00
locking locktorture: Increase Hamming distance between call_rcu_chain and rcu_call_chains 2024-05-22 19:52:16 -04:00
module kunit: add KUNIT_INIT_TABLE to init linker section 2024-07-31 20:32:28 -06:00
power Merge: cgroup/cpuset: Relax restrictions on usage of cpuset.cpus.exclusive 2024-06-27 13:57:22 +00:00
printk prinkt/nbcon: Add a scheduling point to nbcon_kthread_func(). 2024-07-02 08:40:22 -04:00
rcu rcu: Restrict access to RCU CPU stall notifiers 2024-05-31 10:56:18 -04:00
sched Merge: sched/isolation: Prevent boot crash when the boot CPU is nohz_full 2024-08-06 14:31:33 +00:00
time Merge: Kunit update for 9.5 2024-08-15 12:22:20 +00:00
trace tracing/osnoise: Fix build when timerlat is not enabled 2024-09-16 13:28:09 +02:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
Makefile kallsyms: move kallsyms_show_value() out of kallsyms.c 2024-06-17 14:17:27 -04:00
acct.c
async.c async: Introduce async_schedule_dev_nocall() 2024-06-17 12:03:48 -04:00
audit.c Merge: audit: Send netlink ACK before setting connection in auditd_set 2024-08-16 14:22:20 +00:00
audit.h
audit_fsnotify.c
audit_tree.c audit: Annotate struct audit_chunk with __counted_by 2024-07-04 14:52:57 -03:00
audit_watch.c
auditfilter.c audit: remove unnecessary assignment in audit_dupe_lsm_field() 2024-07-04 14:53:06 -03:00
auditsc.c audit,io_uring: io_uring openat triggers audit reference count underflow 2024-07-04 14:53:02 -03:00
backtracetest.c backtracetest: Convert from tasklet to BH workqueue 2024-06-18 17:26:55 -03:00
bounds.c bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS 2024-05-20 16:15:24 -04:00
capability.c
cfi.c cfi: Remove CONFIG_CFI_CLANG_SHADOW 2024-06-17 14:17:20 -04:00
compat.c
configs.c
context_tracking.c
cpu.c cpu/SMT: Enable SMT only if a core is online 2024-09-23 12:43:03 -04:00
cpu_pm.c
crash_core.c Merge: Rebase kexec/kdump to upstream kernel v6.5 2024-05-27 13:52:25 +00:00
crash_dump.c
cred.c
delayacct.c delayacct: track delays from IRQ/SOFTIRQ 2024-07-15 11:12:08 -04:00
dma.c
exec_domain.c
exit.c exit: add internal include file with helpers 2024-07-02 09:45:34 -04:00
exit.h exit: add internal include file with helpers 2024-07-02 09:45:34 -04:00
extable.c
fail_function.c
fork.c fork: lock VMAs of the parent process when forking 2024-07-16 09:30:22 -04:00
freezer.c Revert "Revert "Merge: cgroup: Backport upstream cgroup commits up to v6.8"" 2024-05-18 21:38:20 -04:00
gen_kheaders.sh
groups.c
hung_task.c Revert "Revert "Merge: cgroup: Backport upstream cgroup commits up to v6.8"" 2024-05-18 21:38:20 -04:00
iomem.c
irq_work.c
jump_label.c
kallsyms.c kallsyms: Fix kallsyms_selftest failure 2024-06-17 14:17:28 -04:00
kallsyms_internal.h kallsyms: Reduce the memory occupied by kallsyms_seqs_of_names[] 2024-06-17 14:17:21 -04:00
kallsyms_selftest.c kallsyms: Fix kallsyms_selftest failure 2024-06-17 14:17:28 -04:00
kallsyms_selftest.h kallsyms: Add self-test facility 2024-06-17 14:17:22 -04:00
kcmp.c
kcov.c
kexec.c kexec: introduce sysctl parameters kexec_load_limit_* 2024-05-15 10:32:32 +08:00
kexec_core.c kexec: enable kexec_crash_size to support two crash kernel regions 2024-05-15 10:32:32 +08:00
kexec_elf.c
kexec_file.c kexec: support purgatories with .text.hot sections 2024-05-15 10:32:32 +08:00
kexec_internal.h
kheaders.c
kprobes.c
ksyms_common.c kallsyms: make kallsyms_show_value() as generic function 2024-06-17 14:17:28 -04:00
ksysfs.c
kthread.c kunit: Handle test faults 2024-07-31 20:32:29 -06:00
latencytop.c
module_signature.c
notifier.c
nsproxy.c
padata.c crypto: pcrypt - Fix hungtask for PADATA_RESET 2024-05-29 13:20:49 +08:00
panic.c Revert "kernel/panic.c: Move the location of bust_spinlocks to prevent hanging." 2024-06-21 11:17:34 -04:00
params.c params: Introduce the param_unknown_fn type 2024-06-17 14:17:29 -04:00
pid.c memfd: replace ratcheting feature from vm.memfd_noexec with hierarchy 2024-06-10 12:10:40 -04:00
pid_namespace.c memfd: replace ratcheting feature from vm.memfd_noexec with hierarchy 2024-06-10 12:10:40 -04:00
pid_sysctl.h memfd: replace ratcheting feature from vm.memfd_noexec with hierarchy 2024-06-10 12:10:40 -04:00
platform-feature.c
profile.c
ptrace.c ptrace: Convert ptrace_attach() to use lock guards 2024-07-08 20:54:16 +02:00
range.c
reboot.c kernel/reboot: Add SYS_OFF_MODE_RESTART_PREPARE mode 2024-08-22 11:21:33 -04:00
regset.c
relay.c
resource.c dax/kmem: Fix leak of memory-hotplug resources 2024-07-26 14:42:28 -04:00
resource_kunit.c
rh_messages.c
rh_messages.h redhat: deprecate bnx2xx drivers in rhel-9.5 2024-06-21 11:38:19 -04:00
rh_shadowman.c
rseq.c
scftorture.c
scs.c
seccomp.c
signal.c mm: suppress mm fault logging if fatal signal already pending 2024-09-05 20:37:05 -04:00
smp.c trace,smp: Add tracepoints for scheduling remotelly called functions 2024-06-17 12:58:33 -03:00
smpboot.c
smpboot.h
softirq.c Merge: workqueue: Backport workqueue commits to v6.9 2024-06-13 13:07:43 +00:00
stackleak.c
stacktrace.c
static_call.c
static_call_inline.c
stop_machine.c
sys.c mm: add a NO_INHERIT flag to the PR_SET_MDWE prctl 2024-04-30 17:51:24 -06:00
sys_ni.c x86/shstk: Introduce map_shadow_stack syscall 2024-10-01 11:17:15 -04:00
sysctl-test.c
sysctl.c mm: hugetlb: move hugeltb sysctls to its own file 2024-07-16 09:29:59 -04:00
task_work.c task_work: Introduce task_work_cancel() again 2024-09-04 14:25:56 +02:00
taskstats.c
test_kprobes.c
torture.c torture: Print out torture module parameters 2024-05-22 19:52:15 -04:00
tracepoint.c
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c Revert "Revert "Merge: cgroup: Backport upstream cgroup commits up to v6.8"" 2024-05-18 21:38:20 -04:00
up.c
user-return-notifier.c
user.c
user_namespace.c
usermode_driver.c
utsname.c
utsname_sysctl.c
watch_queue.c kernel: watch_queue: copy user-array safely 2024-05-23 05:14:57 -04:00
watchdog.c Revert "printk: Bring back the RT bits." 2024-05-09 11:24:08 -04:00
watchdog_hld.c Revert "printk: Bring back the RT bits." 2024-05-09 11:24:08 -04:00
workqueue.c workqueue: Always queue work items to the newest PWQ for order workqueues 2024-07-17 09:55:18 -04:00
workqueue_internal.h workqueue: Drop the special locking rule for worker->flags and worker_pool->flags 2024-05-03 13:39:25 -04:00