linux-kernelorg-stable/net
Florian Westphal a60f7bf4a1 netfilter: nft_set_rbtree: continue traversal if element is inactive
When the rbtree lookup function finds a match in the rbtree, it sets the
range start interval to a potentially inactive element.

Then, after tree lookup, if the matching element is inactive, it returns
NULL and suppresses a matching result.

This is wrong and leads to false negative matches when a transaction has
already entered the commit phase.

cpu0					cpu1
  has added new elements to clone
  has marked elements as being
  inactive in new generation
					perform lookup in the set
  enters commit phase:
I) increments the genbit
					A) observes new genbit
					B) finds matching range
					C) returns no match: found
					range invalid in new generation
II) removes old elements from the tree
					D) A new nft_lookup happening now
				       	  will find matching element,
					  because it is no longer
					  obscured by old, inactive one.

Consider a packet P matching range r1-r2:

cpu0 processes the following transaction:
1. remove r1-r2
2. add r1-r3

P is contained in both ranges. Therefore, cpu1 should always find a match
for P.  Due to the above race, this is not the case:

cpu1 does find r1-r2, but then ignores it due to the genbit indicating
the range has been removed.  It does NOT test for further matches.

The situation persists for all lookups until cpu0 reaches II), after
which the r1-r3 range start node is tested for the first time.
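
The generation handling described above can be sketched with a small model. This is an illustrative Python sketch, not the kernel code: the `Element` class, the `inactive_mask` field and the `visible` helper are hypothetical names, and the two-generation bitmask layout is an assumption made for the example. It shows that in each generation exactly one of the two ranges is valid, so a lookup for P should never come back empty.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    name: str
    # Assumption for this sketch: bit g set means "inactive in generation g".
    inactive_mask: int


def is_active(elem: Element, gen: int) -> bool:
    """Return True if the element is valid in generation `gen`."""
    return (elem.inactive_mask & (1 << gen)) == 0


def visible(elements: List[Element], gen: int) -> List[str]:
    """Names of all elements that are valid in generation `gen`."""
    return [e.name for e in elements if is_active(e, gen)]


# Generation 0 is current before the commit; step I) switches to generation 1.
old = Element("r1-r2", inactive_mask=0b10)  # removed: inactive in generation 1
new = Element("r1-r3", inactive_mask=0b01)  # added: inactive in generation 0
```

Here `visible([old, new], gen)` holds a single range for either value of `gen`, which is the invariant the lookup path is expected to preserve between steps I) and II).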

Move the "interval start is valid" check ahead so that tree traversal
continues if the starting interval is not valid in this generation.
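
The effect of moving the check can be sketched as follows. This is a simplified Python model under stated assumptions, not the actual nft_set_rbtree code: the real set is a red-black tree with separate interval start and end nodes, while this sketch uses a flat list of hypothetical `Interval` objects, and `lookup_buggy` and `lookup_fixed` are names invented for illustration. Only the position of the "is this element active" test differs between the two functions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Interval:
    start: int
    end: int
    # Assumption for this sketch: bit g set means "inactive in generation g".
    inactive_mask: int

    def active(self, gen: int) -> bool:
        return (self.inactive_mask & (1 << gen)) == 0

    def contains(self, key: int) -> bool:
        return self.start <= key <= self.end


def lookup_buggy(tree: List[Interval], key: int, gen: int) -> Optional[Interval]:
    """Pick the first matching interval, then test activity after traversal."""
    candidate = None
    for iv in tree:
        if iv.contains(key):
            candidate = iv  # may be inactive in this generation
            break
    if candidate is not None and not candidate.active(gen):
        return None  # match suppressed, no further candidates are tried
    return candidate


def lookup_fixed(tree: List[Interval], key: int, gen: int) -> Optional[Interval]:
    """Test activity during traversal and keep going past inactive matches."""
    for iv in tree:
        if not iv.contains(key):
            continue
        if not iv.active(gen):
            continue  # skip the inactive element and continue the traversal
        return iv
    return None


# r1 = 10, r2 = 20, r3 = 30, packet P = 15 (values chosen for illustration).
old = Interval(10, 20, inactive_mask=0b10)  # r1-r2, removed in generation 1
new = Interval(10, 30, inactive_mask=0b01)  # r1-r3, added in generation 1
tree = [old, new]
```

With both elements still present and the generation already switched (between steps I and II), `lookup_buggy(tree, 15, 1)` returns `None`, whereas `lookup_fixed(tree, 15, 1)` returns the r1-r3 interval. Once the old element is gone, both variants agree again.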

Thanks to Stefan Hanreich for providing an initial reproducer for this
bug.

Reported-by: Stefan Hanreich <s.hanreich@proxmox.com>
Fixes: c1eda3c639 ("netfilter: nft_rbtree: ignore inactive matching element with no descendants")
Signed-off-by: Florian Westphal <fw@strlen.de>
2025-09-10 20:30:37 +02:00
..
6lowpan
9p
802
8021q
appletalk
atm net: atm: fix memory leak in atm_register_sysfs when device_register fail 2025-09-04 09:53:44 +02:00
ax25 ax25: properly unshare skbs in ax25_kiss_rcv() 2025-09-03 17:06:30 -07:00
batman-adv
bluetooth
bpf
bridge net: bridge: Bounce invalid boolopts 2025-09-08 18:23:40 -07:00
caif
can
ceph
core net: dev_ioctl: take ops lock in hwtstamp lower paths 2025-09-09 18:13:36 -07:00
dcb
devlink
dns_resolver
dsa
ethernet
ethtool
handshake
hsr
ieee802154
ife
ipv4 tunnels: reset the GSO metadata before reusing the skb 2025-09-09 13:03:33 +02:00
ipv6 ipv6: annotate data-races around devconf->rpl_seg_enabled 2025-09-02 17:01:06 -07:00
iucv
kcm
key
l2tp
l3mdev
lapb
llc
mac80211
mac802154
mctp mctp: return -ENOPROTOOPT for unknown getsockopt options 2025-09-03 17:01:52 -07:00
mpls
mptcp mptcp: sockopt: make sync_socket_options propagate SOCK_KEEPOPEN 2025-09-09 18:38:06 -07:00
ncsi
netfilter netfilter: nft_set_rbtree: continue traversal if element is inactive 2025-09-10 20:30:37 +02:00
netlabel
netlink genetlink: fix genl_bind() invoking bind() after -EPERM 2025-09-08 17:50:36 -07:00
netrom
nfc
nsh
openvswitch
packet
phonet
psample
qrtr
rds
rfkill
rose
rxrpc
sched
sctp
shaper
smc net/smc: Remove validation of reserved bits in CLC Decline message 2025-09-03 17:01:07 -07:00
strparser
sunrpc
switchdev
tipc
tls
unix
vmw_vsock
wireless wifi: cfg80211: sme: cap SSID length in __cfg80211_connect_result() 2025-09-03 09:37:55 +02:00
x25
xdp
xfrm
Kconfig
Kconfig.debug
Makefile
compat.c
devres.c
socket.c
sysctl_net.c