Kumar Kartikeya Dwivedi 164c246571 rqspinlock: Protect waiters in queue from stalls
Implement the wait queue cleanup algorithm for rqspinlock. There are
three forms of waiters in the original queued spin lock algorithm. The
first is the waiter which acquires the pending bit and spins on the lock
word without forming a wait queue. The second is the head waiter, the
first waiter in the wait queue. The third comprises all the non-head
waiters queued behind the head, each waiting to be signalled through its
MCS node to take over the responsibility of the head.
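
As a rough illustration of these three forms, the sketch below classifies
a slow-path arrival from a snapshot of the lock word. It assumes the
common qspinlock layout (locked byte in bits 0-7, pending in bit 8, tail
in bits 16-31). The enum, macro, and function names are hypothetical, and
the real slow path makes these decisions with atomic operations rather
than from a plain snapshot.

```c
#include <assert.h>  /* only needed by callers that check results */
#include <stdint.h>

/* Assumed lock word layout (the common qspinlock configuration). */
#define Q_LOCKED_MASK  0x000000ffu  /* bits 0-7: lock is held */
#define Q_PENDING_VAL  0x00000100u  /* bit 8: a pending waiter exists */
#define Q_TAIL_MASK    0xffff0000u  /* bits 16-31: encoded queue tail */

/* Hypothetical labels for the three forms of waiters. */
enum waiter_kind {
	WAITER_PENDING,        /* spins on the lock word, forms no queue */
	WAITER_QUEUE_HEAD,     /* first waiter in the wait queue */
	WAITER_QUEUE_NONHEAD,  /* queued behind the head, spins on its MCS node */
};

static enum waiter_kind classify_arrival(uint32_t val)
{
	/* Only the locked byte may be set: the pending bit is free to take. */
	if (!(val & ~Q_LOCKED_MASK))
		return WAITER_PENDING;
	/* Pending is taken; an empty tail means we would start the queue. */
	if (!(val & Q_TAIL_MASK))
		return WAITER_QUEUE_HEAD;
	return WAITER_QUEUE_NONHEAD;
}
```

For example, a word with only the locked byte set yields WAITER_PENDING,
while a word with locked and pending set but no tail yields
WAITER_QUEUE_HEAD.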

In this commit, we are concerned with the second and third kinds. First,
we augment the waiting loop of the head of the wait queue with a
timeout. When this timeout fires, all waiters that are part of the wait
queue abort their lock acquisition attempts. This happens in three
steps. First, the head breaks out of its loop waiting for the pending
and locked bits to turn to 0, and non-head waiters break out of their
MCS node spin (more on that later). Next, every waiter (head or
non-head) checks whether it is also the tail waiter; if so, it attempts
to zero out the tail word and allow a new queue to be built up for this
lock. If it succeeds, it has no one to signal next in the queue to stop
spinning. Otherwise, it signals the MCS node of the next waiter to break
out of its spin and try resetting the tail word back to 0. This goes on
until the tail waiter is found. In case of races, the new tail will be
responsible for performing the same task, as the old tail will then fail
to reset the tail word and wait for its next pointer to be updated
before it signals the new tail to do the same.
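
The teardown described above can be sketched as a small userspace model
built on C11 atomics. This is a minimal sketch and not the kernel code:
the types, the NODE_* values, and the function names are all
hypothetical, and the tail is modelled as a plain pointer rather than an
encoded field in the lock word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the kernel. */
enum { NODE_WAIT = 0, NODE_TIMEOUT = 2 };

struct node {
	_Atomic(struct node *) next;  /* successor in the wait queue */
	atomic_int state;             /* value the owner of this node spins on */
};

struct lock {
	_Atomic(struct node *) tail;  /* stands in for the tail word */
};

/* Join the queue: swap ourselves in as tail, then link behind the old tail. */
static struct node *enqueue(struct lock *l, struct node *me)
{
	atomic_init(&me->next, (struct node *)NULL);
	atomic_init(&me->state, NODE_WAIT);
	struct node *prev = atomic_exchange(&l->tail, me);
	if (prev)
		atomic_store(&prev->next, me);
	return prev;  /* NULL means we are the head */
}

/*
 * Steps two and three for a waiter that has already left its spin loop,
 * either as the head on timeout or after observing NODE_TIMEOUT.
 */
static void abort_waiter(struct lock *l, struct node *me)
{
	struct node *expected = me;
	struct node *next;

	/* If we are still the tail, clear it so a new queue can form. */
	if (atomic_compare_exchange_strong(&l->tail, &expected,
					   (struct node *)NULL))
		return;  /* nobody behind us to signal */

	/* A successor swapped the tail; wait until it links itself to us. */
	while ((next = atomic_load(&me->next)) == NULL)
		;
	assert(next != me);
	atomic_store(&next->state, NODE_TIMEOUT);
}
```

With three queued nodes, calling abort_waiter() on each in queue order
passes NODE_TIMEOUT down the chain, and only the last call succeeds in
resetting the tail.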

We terminate the whole wait queue for two main reasons. Firstly, we
eschew per-waiter timeouts in favor of one applied at the head of the
wait queue. This allows everyone to break out faster once we've seen the
owner / pending waiter not responding for the timeout duration from the
head. Secondly, it avoids complicated synchronization: when waiters do
not leave in FIFO order, prev's next pointer needs to be fixed up, among
other things.

Lastly, all of these waiters release the rqnode and return to the
caller. This patch underscores the point that rqspinlock's timeout does
not apply to each waiter individually, and cannot be relied upon as an
upper bound. It is possible for the rqspinlock waiters to return early
from a failed lock acquisition attempt as soon as stalls are detected.

The head waiter cannot directly WRITE_ONCE the tail to zero, as it may
race with a concurrent xchg and a non-head waiter linking its MCS node
to the head's MCS node through 'prev->next' assignment.
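
To make the race concrete, here is a minimal sketch using C11 atomics
with the tail reduced to a small integer id (0 meaning "no queue"). The
helper names are hypothetical. A new waiter publishes itself with an
atomic exchange, so an unconditional store of 0 by the head after that
point would erase the newcomer from the tail while it still expects to be
signalled. A compare-and-exchange only clears the tail if it still names
the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: the tail holds a waiter id, 0 means no queue. */

/* A new waiter publishes itself and learns its predecessor in one step. */
static uint32_t publish_tail(_Atomic uint32_t *tail, uint32_t my_id)
{
	return atomic_exchange(tail, my_id);
}

/* Clear the tail only if it still names us; fail if someone queued behind. */
static bool clear_tail_if_mine(_Atomic uint32_t *tail, uint32_t my_id)
{
	uint32_t expected = my_id;

	return atomic_compare_exchange_strong(tail, &expected, 0u);
}
```

If waiter 2 publishes itself while waiter 1 is the head, the head's
clear_tail_if_mine() fails and leaves the tail at 2, so the head knows it
must wait for the link and signal its successor instead.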

One notable thing is that we must use RES_DEF_TIMEOUT * 2 as our maximum
duration for the waiting loop (for the wait queue head), since we may
have both the owner and pending bit waiter ahead of us, and in the worst
case, need to span their maximum permitted critical section lengths.
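
The arithmetic behind the doubled budget can be sketched as follows. The
numeric value given to RES_DEF_TIMEOUT here is a placeholder for
illustration only, and the helper names are hypothetical. If the owner
may hold the lock for up to one timeout period and the pending bit waiter
may then hold it for up to another, the head can legitimately wait for
two periods before concluding that something is stalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder value in nanoseconds; the text above does not state it. */
#define RES_DEF_TIMEOUT 250000000ULL

/* Hypothetical helper: deadline for the head of the wait queue. */
static uint64_t head_deadline(uint64_t start_ns)
{
	/* One budget for the owner, one for the pending bit waiter. */
	return start_ns + 2 * RES_DEF_TIMEOUT;
}

/* Hypothetical helper: has the head waited longer than both budgets? */
static bool head_timed_out(uint64_t start_ns, uint64_t now_ns)
{
	return now_ns > head_deadline(start_ns);
}
```

Under these assumptions, a head that has waited slightly longer than one
period has not timed out yet, since the pending bit waiter may still be
inside its own permitted critical section.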

Reviewed-by: Barret Rhoden <brho@google.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250316040541.108729-11-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-19 08:03:05 -07:00
Documentation bpf, docs: Fix broken link to renamed bpf_iter_task_vmas.c 2025-03-15 11:48:56 -07:00
LICENSES
arch rqspinlock: Hardcode cond_acquire loops for arm64 2025-03-19 08:03:04 -07:00
block block-6.14-20250214 2025-02-14 11:40:59 -08:00
certs
crypto treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
drivers Smaller than usual with no fixes from any subtree. 2025-02-20 10:19:54 -08:00
fs Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4 2025-02-20 18:13:57 -08:00
include rqspinlock: Protect pending bit owners from stalls 2025-03-19 08:03:05 -07:00
init Kbuild updates for v6.14 2025-01-31 12:07:07 -08:00
io_uring io_uring-6.14-20250214 2025-02-14 11:30:53 -08:00
ipc treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
kernel rqspinlock: Protect waiters in queue from stalls 2025-03-19 08:03:05 -07:00
lib test_xarray: fix failure in check_pause when CONFIG_XARRAY_MULTI is not defined 2025-02-17 22:40:04 -08:00
mm Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4 2025-02-20 18:13:57 -08:00
net net: filter: Avoid shadowing variable in bpf_convert_ctx_access() 2025-03-15 11:48:27 -07:00
rust Driver core api addition for 6.14-rc3 2025-02-16 12:54:42 -08:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4 2025-02-20 18:13:57 -08:00
scripts kbuild, bpf: Correct pahole version that supports distilled base btf feature 2025-02-24 14:14:52 -08:00
security security: Propagate caller information in bpf hooks 2025-03-15 11:48:58 -07:00
sound ALSA: seq: Drop UMP events when no UMP-conversion is set 2025-02-17 18:02:02 +01:00
tools bpftool: Using the right format specifiers 2025-03-17 13:50:56 -07:00
usr
virt KVM: remove kvm_arch_post_init_vm 2025-02-04 11:27:45 -05:00
.clang-format
.clippy.toml
.cocciconfig
.editorconfig
.get_maintainer.ignore
.gitattributes
.gitignore
.mailmap mailmap: update Nick's entry 2025-02-17 22:40:03 -08:00
.rustfmt.toml
COPYING
CREDITS MAINTAINERS: Move Pavel to kernel.org address 2025-02-07 09:12:33 -08:00
Kbuild
Kconfig
MAINTAINERS Smaller than usual with no fixes from any subtree. 2025-02-20 10:19:54 -08:00
Makefile Linux 6.14-rc3 2025-02-16 14:02:44 -08:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.