flannel cross node traffic does not work with latest systemd 242 due to a race #1155

@mcastelino

Expected Behavior

Cross-node pod traffic should work, and node-to-pod traffic should work across nodes.

Current Behavior

When running flannel with systemd 242 or later, there appears to be a race between flannel setting the MAC address of the flannel.1 interface and systemd setting the MAC address on the same virtual interface. As a result, all cross-node traffic is dropped at layer 2 on the destination node, because remote nodes send to a destination VTEP MAC that no longer matches the interface.

With systemd 242, the default policy is set to MACAddressPolicy=persistent:

/usr/lib/systemd/network/99-default.link

[Link]
NamePolicy=keep kernel database onboard slot path
MACAddressPolicy=persistent

When flannel brings up the interface, it programs the MAC address, and systemd then reprograms it.

The interface on the destination node (clr-02) ends up with the following MAC address:

clear@clr-02 ~ $ ip addr show flannel.1
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether d6:02:e3:df:ea:7a brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever

But the neighbor (ARP) tables on remote nodes are set up with a different MAC address: the interface has d6:02:e3:df:ea:7a, while the remote entry points to 5e:89:db:49:c6:a4.

clear@clr-01 ~ $ ip neigh
10.244.1.0 dev flannel.1 lladdr 5e:89:db:49:c6:a4 PERMANENT
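
The mismatch above can be checked mechanically. The following is a minimal sketch, not part of flannel: the helper names `link_mac`, `neigh_mac`, and `macs_match` are hypothetical, and the parsing assumes `ip` output shaped like the two transcripts above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the field positions
# assume `ip addr show` / `ip neigh` output shaped like the transcripts above.

# Print the link/ether address found in `ip addr show <dev>` output on stdin.
link_mac() {
    awk '$1 == "link/ether" { print $2; exit }'
}

# Print the lladdr recorded for IP address $1 in `ip neigh` output on stdin.
neigh_mac() {
    awk -v ip="$1" '
        $1 == ip {
            for (i = 2; i < NF; i++)
                if ($i == "lladdr") { print $(i + 1); exit }
        }'
}

# Return 0 when the two addresses are identical, 1 otherwise.
macs_match() {
    [ "$1" = "$2" ]
}
```

On the destination node, `ip addr show flannel.1 | link_mac` gives the address the interface really has. On a remote node, `ip neigh | neigh_mac 10.244.1.0` gives the address that node will send to. If `macs_match` fails for the two values, the node is likely affected by this race.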

The netlink trace shows the MAC address being set twice: first by flannel, and then to a different address by systemd according to its default policy.

clear@clr-02 ~ $ sudo ip monitor all
[NETCONF]inet flannel.1 forwarding on rp_filter off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
[NETCONF]inet6 flannel.1 forwarding off proxy_neigh off ignore_routes_with_linkdown off
[LINK]4: flannel.1: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default
    link/ether 5e:89:db:49:c6:a4 brd ff:ff:ff:ff:ff:ff
[LINK]4: flannel.1: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN group default
    link/ether d6:02:e3:df:ea:7a brd ff:ff:ff:ff:ff:ff
[ADDR]4: flannel.1    inet 10.244.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
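
To spot the double assignment without reading the trace by eye, the monitor output can be filtered. This is a rough sketch: `mac_history` is a hypothetical helper, and it assumes labeled output as produced by `ip monitor all` in the trace above, where each event starts with a `[LINK]`-style prefix.

```shell
#!/bin/sh
# Sketch only: mac_history is a hypothetical helper. It assumes labeled
# output ("[LINK]4: flannel.1: ...") as in the `ip monitor all` trace above.

# Print, in order, each distinct consecutive MAC that device $1 takes on,
# reading `ip monitor` output from stdin.
mac_history() {
    awk -v dev="$1" '
        # A labeled line starts a new event; watch it only if it is a
        # [LINK] event for the requested device.
        /^\[/ { watching = ($1 ~ /^\[LINK\]/ && $2 == dev ":"); next }
        # Continuation line of a watched event carrying the MAC address.
        watching && $1 == "link/ether" && $2 != last { print $2; last = $2 }
    '
}
```

For example, `sudo ip monitor all | mac_history flannel.1` prints one address per change. More than one line during interface creation suggests that something other than flannel rewrote the MAC.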

Possible Solution

  • The user can set MACAddressPolicy=none for the flannel* interfaces on each system, which hides the issue but requires node-level changes
    or
  • Flannel can watch for MAC address changes on the link and reprogram the address
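
The first option could look roughly like the sketch below. The function name `write_flannel_link`, the file name `10-flannel.link`, and the default directory `/etc/systemd/network` are assumptions chosen for illustration; the only requirement taken from the text above is a `MACAddressPolicy=none` setting that matches the flannel* interfaces and is read before `99-default.link`.

```shell
#!/bin/sh
# Sketch only: the function name, the file name 10-flannel.link, and the
# default directory are assumptions, not something flannel installs itself.

# Write a .link file into directory $1 (default: /etc/systemd/network) that
# tells systemd not to assign a MAC address to flannel* interfaces.
write_flannel_link() {
    dir="${1:-/etc/systemd/network}"
    mkdir -p "$dir" || return 1
    cat > "$dir/10-flannel.link" <<'EOF'
[Match]
OriginalName=flannel*

[Link]
MACAddressPolicy=none
EOF
}
```

Running `write_flannel_link` as root on each node creates the override. The policy is applied when a link is created, so an existing flannel.1 interface probably needs to be recreated (or the node rebooted) before the change is visible.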

Steps to Reproduce (for bugs)

  1. kubeadm init
  2. kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml
  3. Ping a pod on a remote node

Context

Flannel and any flannel-based network plugin (for example, Canal) stop working with systemd 242.
This will impact other distributions when they upgrade to systemd 242 and beyond.

Your Environment
