Go to file
Willem de Bruijn 32607a332c ipv4: prefer multipath nexthop that matches source address
With multipath routes, try to ensure that packets leave on the device
that is associated with the source address.

Avoid the following tcpdump example:

    veth0 Out IP 10.1.0.2.38640 > 10.2.0.3.8000: Flags [S]
    veth1 Out IP 10.1.0.2.38648 > 10.2.0.3.8000: Flags [S]

Which can happen easily with the most straightforward setup:

    ip addr add 10.0.0.1/24 dev veth0
    ip addr add 10.1.0.1/24 dev veth1

    ip route add 10.2.0.3 nexthop via 10.0.0.2 dev veth0 \
    			  nexthop via 10.1.0.2 dev veth1

This is apparently considered WAI, based on the comment in
ip_route_output_key_hash_rcu:

    * 2. Moreover, we are allowed to send packets with saddr
    *    of another iface. --ANK

It may be ok for some uses of multipath, but not all. For instance,
when using two ISPs, a router may drop packets with unknown source.

The behavior occurs because tcp_v4_connect makes three route
lookups when establishing a connection:

1. ip_route_connect calls to select a source address, with saddr zero.
2. ip_route_connect calls again now that saddr and daddr are known.
3. ip_route_newports calls again after a source port is also chosen.

With a route with multiple nexthops, each lookup may make a different
choice depending on available entropy to fib_select_multipath. So it
is possible for 1 to select the saddr from the first entry, but 3 to
select the second entry. Leading to the above situation.

Address this by preferring a match that matches the flowi4 saddr. This
will make 2 and 3 make the same choice as 1. Continue to update the
backup choice until a choice that matches saddr is found.

Do this in fib_select_multipath itself, rather than passing an fl4_oif
constraint, to avoid changing non-multipath route selection. Commit
e6b45241c5 ("ipv4: reset flowi parameters on route connect") shows
how that may cause regressions.

Also read ipv4.sysctl_fib_multipath_use_neigh only once. No need to
refresh in the loop.

This does not happen in IPv6, which performs only one lookup.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250424143549.669426-2-willemdebruijn.kernel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-29 16:22:25 +02:00
Documentation net: ti: icssg-prueth: Add ICSSG FW Stats 2025-04-28 17:20:53 -07:00
LICENSES LICENSES: add 0BSD license text 2024-09-01 20:43:24 -07:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-24 11:20:52 -07:00
block vfs-6.15-rc3.fixes.2 2025-04-19 14:31:08 -07:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto crypto: scomp - Fix off-by-one bug when calculating last page 2025-04-23 09:32:57 +08:00
drivers net: ti: icssg-prueth: Add ICSSG FW Stats 2025-04-28 17:20:53 -07:00
fs Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-24 11:20:52 -07:00
include ipv4: prefer multipath nexthop that matches source address 2025-04-29 16:22:25 +02:00
init Kconfig: switch CONFIG_SYSFS_SYCALL default to n 2025-04-15 10:28:35 +02:00
io_uring io_uring/zcrx: fix late dma unmap for a dead dev 2025-04-18 06:12:10 -06:00
ipc treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-24 11:20:52 -07:00
lib hardening fixes for v6.15-rc3 2025-04-18 13:20:20 -07:00
mm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-24 11:20:52 -07:00
net ipv4: prefer multipath nexthop that matches source address 2025-04-29 16:22:25 +02:00
rust rust: helpers: Add dma_alloc_attrs() and dma_free_attrs() 2025-04-15 23:06:03 +02:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-17 12:26:50 -07:00
scripts Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-24 11:20:52 -07:00
security Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-24 11:20:52 -07:00
sound ASoC: Fixes for v6.15 2025-04-11 15:51:19 +02:00
tools tools/Makefile: Add ynl target 2025-04-28 17:18:48 -07:00
usr kbuild: hdrcheck: fix cross build with clang 2025-03-05 04:06:45 +09:00
virt ARM: 2025-04-08 13:47:55 -07:00
.clang-format clang-format: Update the ForEachMacros list for v6.15-rc1 2025-04-13 11:03:59 +02:00
.clippy.toml rust: give Clippy the minimum supported Rust version 2025-01-10 00:17:25 +01:00
.cocciconfig
.editorconfig
.get_maintainer.ignore MAINTAINERS: Retire Ralf Baechle 2024-11-12 15:48:59 +01:00
.gitattributes
.gitignore kbuild: Create intermediate vmlinux build with relocations preserved 2025-03-17 00:29:50 +09:00
.mailmap sound fixes for 6.15-rc3 2025-04-17 10:14:51 -07:00
.rustfmt.toml
COPYING
CREDITS MAINTAINERS: update SLAB ALLOCATOR maintainers 2025-04-17 20:10:06 -07:00
Kbuild drm: ensure drm headers are self-contained and pass kernel-doc 2025-02-12 10:44:43 +02:00
Kconfig io_uring: Rename KConfig to Kconfig 2025-02-19 14:53:27 -07:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-24 11:20:52 -07:00
Makefile Fix mis-uses of 'cc-option' for warning disablement 2025-04-23 10:08:29 -07:00
README

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.