linux-kernelorg-stable/net
Vadim Fedorenko 6e17474aa9 net: fib: restore ECMP balance from loopback
Preferring the nexthop whose source address matches broke ECMP for packets
whose source addresses are not in the broadcast domain but are instead
assigned to loopback/dummy interfaces. The original behaviour was to balance
over the nexthops, whereas now the last nexthop in the group is always used.
To fix the issue, introduce a nexthop scoring system in which nexthops whose
source address equals the requested one always have higher priority.
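
The scoring idea described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the code from the
patch: the struct, the function name, the score weights (2 for a source
address match, 1 for the hash falling within the nexthop's range) and the
31-bit upper bounds are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one nexthop in an ECMP group (not a kernel struct). */
struct model_nexthop {
	uint32_t saddr;      /* preferred source address, host byte order */
	int32_t upper_bound; /* upper edge of the hash range, -1 if unusable */
};

/*
 * Return the index of the selected nexthop, or -1 if none is usable.
 *
 * Assumed scoring: +2 if the nexthop's source address equals the requested
 * one, +1 if the flow hash falls at or below the nexthop's upper bound.
 * The first nexthop with the highest score wins, so a source match always
 * beats a hash-only match, and without any source match the choice falls
 * back to plain hash-threshold balancing.
 */
static int model_select_nexthop(const struct model_nexthop *nh, size_t n,
				int32_t hash, uint32_t saddr)
{
	int best = -1;
	int best_score = -1;

	assert(nh != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int score = 0;

		if (nh[i].upper_bound == -1)
			continue; /* skip unusable nexthops */
		if (saddr != 0 && nh[i].saddr == saddr)
			score += 2;
		if (hash <= nh[i].upper_bound)
			score += 1;
		/* Strict comparison keeps the earliest best candidate. */
		if (score > best_score) {
			best = (int)i;
			best_score = score;
		}
	}
	return best;
}
```

In this model, a group of two equal-weight nexthops with upper bounds
0x3FFFFFFF and 0x7FFFFFFF and a requested source address that matches neither
of them (such as the dummy0 address) is split by hash value, while a requested
source address that matches one nexthop selects that nexthop regardless of the
hash.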

For the case with 198.51.100.1/32 assigned to dummy0 and routed using
192.0.2.0/24 and 203.0.113.0/24 networks:

2: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether d6:54:8a:ff:78:f5 brd ff:ff:ff:ff:ff:ff
    inet 198.51.100.1/32 scope global dummy0
       valid_lft forever preferred_lft forever
7: veth1@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 06:ed:98:87:6d:8a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.0.2.2/24 scope global veth1
       valid_lft forever preferred_lft forever
    inet6 fe80::4ed:98ff:fe87:6d8a/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever
9: veth3@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ae:75:23:38:a0:d2 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 203.0.113.2/24 scope global veth3
       valid_lft forever preferred_lft forever
    inet6 fe80::ac75:23ff:fe38:a0d2/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever

~ ip ro list:
default
	nexthop via 192.0.2.1 dev veth1 weight 1
	nexthop via 203.0.113.1 dev veth3 weight 1
192.0.2.0/24 dev veth1 proto kernel scope link src 192.0.2.2
203.0.113.0/24 dev veth3 proto kernel scope link src 203.0.113.2

before:
   for i in {1..255} ; do ip ro get 10.0.0.$i; done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c
    255 veth3

after:
   for i in {1..255} ; do ip ro get 10.0.0.$i; done | grep veth | awk ' {print $(NF-2)}' | sort | uniq -c
    122 veth1
    133 veth3

Fixes: 32607a332c ("ipv4: prefer multipath nexthop that matches source address")
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20251221192639.3911901-1-vadim.fedorenko@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-12-30 11:07:38 +01:00
6lowpan
9p - fix a bug with O_APPEND in cached mode causing data to be written multiple times on server 2025-12-07 08:29:09 -08:00
802
8021q
appletalk
atm
ax25
batman-adv
bluetooth Bluetooth: MGMT: report BIS capability flags in supported settings 2025-12-19 17:11:27 -05:00
bpf
bridge net: bridge: Describe @tunnel_hash member in net_bridge_vlan_group struct 2025-12-28 10:17:14 +01:00
caif caif: fix integer underflow in cffrml_receive() 2025-12-11 01:35:41 -08:00
can can: j1939: make j1939_sk_bind() fail if device is no longer registered 2025-12-17 10:47:33 +01:00
ceph We have a patch that adds an initial set of tracepoints to the MDS 2025-12-14 15:24:10 +12:00
core net: avoid prefetching NULL pointers 2025-12-28 10:19:11 +01:00
dcb
devlink
dns_resolver
dsa net: dsa: fix missing put_device() in dsa_tree_find_first_conduit() 2025-12-23 10:32:08 +01:00
ethernet
ethtool ethtool: Avoid overflowing userspace buffer on stats query 2025-12-18 12:24:25 +01:00
handshake net/handshake: Fix null-ptr-deref in handshake_complete() 2025-12-22 12:36:40 +01:00
hsr net/hsr: fix NULL pointer dereference in prp_get_untagged_frame() 2025-12-04 11:15:13 +01:00
ieee802154
ife
ipv4 net: fib: restore ECMP balance from loopback 2025-12-30 11:07:38 +01:00
ipv6 ipv6: BUG() in pskb_expand_head() as part of calipso_skbuff_setattr() 2025-12-29 19:36:45 +01:00
iucv net: Remove KMSG_COMPONENT macro 2025-11-28 19:20:27 -08:00
kcm Networking changes for 6.19. 2025-12-03 17:24:33 -08:00
key
l2tp l2tp: correct debugfs label for tunnel tx stats 2025-12-01 12:03:09 -08:00
l3mdev
lapb
llc
mac80211 wifi: mac80211: ocb: skip rx_no_sta when interface is not joined 2025-12-16 10:33:14 +01:00
mac802154
mctp net: mctp: test: move TX packetqueue from dst to dev 2025-12-01 13:52:13 -08:00
mpls
mptcp mptcp: ensure context reset on disconnect() 2025-12-23 09:12:25 +01:00
ncsi
netfilter netfilter: nf_tables: avoid softlockup warnings in nft_chain_validate 2025-12-15 15:04:04 +01:00
netlabel
netlink
netrom netrom: Fix memory leak in nr_sendmsg() 2025-12-04 11:01:17 +01:00
nfc net: nfc: fix deadlock between nfc_unregister_device and rfkill_fop_write 2025-12-28 09:15:42 +01:00
nsh
openvswitch net: openvswitch: Avoid needlessly taking the RTNL on vport destroy 2025-12-22 12:25:11 +01:00
packet
phonet
psample
psp
qrtr
rds
rfkill
rose
rxrpc
sched net/sched: act_mirred: fix loop detection 2025-12-18 16:42:18 +01:00
sctp sctp: Clear inet_opt in sctp_v6_copy_ip_options(). 2025-12-18 16:18:00 +01:00
shaper
smc net: smc: SMC_HS_CTRL_BPF should depend on BPF_JIT 2025-12-04 11:07:18 -08:00
strparser
sunrpc NFS client updates for Linux 6.19 2025-12-12 21:52:42 +12:00
switchdev
tipc
tls
unix af_unix: don't post cmsg for SO_INQ unless explicitly asked for 2025-12-28 16:11:22 +01:00
vmw_vsock
wireless wifi: cfg80211: sme: store capped length in __cfg80211_connect_result() 2025-12-16 10:22:51 +01:00
x25
xdp
xfrm
Kconfig
Kconfig.debug
Makefile
compat.c
devres.c
socket.c vfs-6.19-rc1.fixes 2025-12-05 15:52:30 -08:00
sysctl_net.c