@robmry
Contributor

@robmry robmry commented Mar 10, 2025

- What I did

For kernels that don't have CONFIG_IP_NF_RAW, if the env var DOCKER_INSECURE_NO_IPTABLES_RAW is set to "1", don't try to create raw rules.

Warning: When the environment variable is set, direct routing to published ports is possible from other hosts on the local network, even if the port is published to a loopback address. This undoes some of the security hardening described at https://www.docker.com/blog/docker-engine-28-hardening-container-networking-by-default/
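
To illustrate the gating described above, here is a minimal sketch (not the code in this PR). The helper `tablesToProgram` and the injected `getenv` parameter are hypothetical, added only to make the check easy to exercise, and the table list is a simplification. The one detail taken from the PR is that only the exact value "1" disables the raw rules.

```go
package main

import (
	"fmt"
	"os"
)

// envNoRaw is the environment variable name introduced by this PR.
const envNoRaw = "DOCKER_INSECURE_NO_IPTABLES_RAW"

// rawRulesDisabled reports whether raw-table rules should be skipped.
// getenv is injected for testability (an assumption of this sketch);
// pass os.Getenv for real use.
func rawRulesDisabled(getenv func(string) string) bool {
	// Only the exact value "1" opts in; anything else keeps the raw rules.
	return getenv(envNoRaw) == "1"
}

// tablesToProgram is a hypothetical, simplified helper listing which
// iptables tables a bridge driver would write rules to.
func tablesToProgram(getenv func(string) string) []string {
	tables := []string{"filter", "nat"}
	if !rawRulesDisabled(getenv) {
		tables = append(tables, "raw")
	}
	return tables
}

func main() {
	fmt.Println(tablesToProgram(os.Getenv))
}
```

Values such as "true" or "yes" leave the raw rules in place under this check, which keeps the insecure mode an explicit opt-in.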

- How I did it

The env var is DOCKER_INSECURE_NO_IPTABLES_RAW ... because that's why the workaround is needed. Alternatively, it could be called something like DOCKER_INSECURE_ALLOW_DIRECT_ROUTING ... because that's currently the effect. Then it'd need to do the same thing for an nftables/firewalld implementation.

If we want to add a more "feature-y" way to allow direct routing at some point, it should be via a new "gateway mode" with well-defined semantics, something less drastic than nat-unprotected (allowing access from remote hosts, but only to published ports, and maybe not to ports published to 127.0.0.1). That'd work per network rather than globally, and it'd need some different regression testing and more documentation.

So, I went for this simple workaround for kernels without the required module.

- How to verify it

New integration test.

Also, checked a container with host port mapping started on a host without CONFIG_IP_NF_RAW ...

# cat /etc/systemd/system/docker.service.d/insecure_direct_routing.conf
[Service]
Environment="DOCKER_INSECURE_NO_IPTABLES_RAW=1"
# systemctl daemon-reload
# systemctl restart docker
# docker run --rm -ti -p 127.0.0.1:8080:80 alpine

In docker's log ...

Mar 10 15:38:50 debian2 dockerd[7232]: time="2025-03-10T15:38:50.830162607Z" level=debug msg="DOCKER_INSECURE_NO_IPTABLES_RAW=1 - skipping raw rules" eid=3c070182713114fcf3b58f606cd0d2dc773516360c0bd9eed60bb4899d90d467 ep=elated_keldysh net=bridge nid=b30950846c3a84d5e17e8cf0406ff435a76323e87d41d3fdf47262032e670af6
Mar 10 15:38:50 debian2 dockerd[7232]: time="2025-03-10T15:38:50.830260399Z" level=debug msg="DOCKER_INSECURE_NO_IPTABLES_RAW=1 - skipping raw rules" eid=3c070182713114fcf3b58f606cd0d2dc773516360c0bd9eed60bb4899d90d467 ep=elated_keldysh net=bridge nid=b30950846c3a84d5e17e8cf0406ff435a76323e87d41d3fdf47262032e670af6

(And, no rules created in the iptables/ip6tables raw table.)
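
One way to automate the "no rules in the raw table" check is to count appended rules in a saved dump of `iptables -t raw -S` (or the `ip6tables` equivalent). This is a hedged sketch, not the PR's integration test. The function name `countAppendedRules` is hypothetical, and the sample rule line is purely illustrative, not a rule Docker is known to create.

```go
package main

import (
	"fmt"
	"strings"
)

// countAppendedRules counts lines that append a rule ("-A ...") in the
// output of `iptables -t raw -S`. Policy lines ("-P ...") and chain
// definitions ("-N ...") are not counted.
func countAppendedRules(dump string) int {
	n := 0
	for _, line := range strings.Split(dump, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), "-A ") {
			n++
		}
	}
	return n
}

func main() {
	// An empty raw table lists only the built-in chain policies.
	empty := "-P PREROUTING ACCEPT\n-P OUTPUT ACCEPT\n"
	fmt.Println(countAppendedRules(empty)) // prints 0

	// Illustrative rule line only.
	withRule := empty + "-A PREROUTING -p tcp --dport 8080 -j DROP\n"
	fmt.Println(countAppendedRules(withRule)) // prints 1
}
```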

- Human readable description for the release notes

Add the environment variable `DOCKER_INSECURE_NO_IPTABLES_RAW=1` to allow Docker to run on systems where the Linux kernel can't provide `CONFIG_IP_NF_RAW` support. When enabled, Docker will not create rules in the iptables `raw` table. Warning: This is not recommended for production environments as it reduces security by allowing other hosts on the local network to route to ports published to host addresses, even when they are published to `127.0.0.1`. This option bypasses some of the security hardening introduced in Docker Engine 28.0.0.

@robmry robmry added the kind/feature, area/networking, impact/changelog, area/networking/firewalling, area/networking/d/bridge, area/networking/portmapping, and version/28.0 labels Mar 10, 2025
@robmry robmry added this to the 28.0.2 milestone Mar 10, 2025
@robmry robmry self-assigned this Mar 10, 2025
@thaJeztah
Member

Perhaps something we should add to the warnings returned by docker info, so that it's visible to the user that these are disabled.

Not sure how easy it would be to punch through this option all the way from the daemon config, but if that's not trivial (likely isn't?) we could add the same check for the env-var somewhere in the daemon.systeminfo (or whatever a good place would be)

@robmry
Contributor Author

robmry commented Mar 10, 2025

Perhaps something we should add to the warnings returned by docker info, so that it's visible to the user that these are disabled.

Sure, how does this look? ... could add some words about the implications, but that could get a bit too wordy?

# docker info
Client:
 Version:    28.0.1
 [...]
Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 [...]
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false

[DEPRECATION NOTICE]: API is accessible on http://0.0.0.0:2375 without encryption.
         Access to the remote API is equivalent to root access on the host. Refer
         to the 'Docker daemon attack surface' section in the documentation for
         more information: https://docs.docker.com/go/attack-surface/
In future versions this will be a hard failure preventing the daemon from starting! Learn more at: https://docs.docker.com/go/api-security/
WARNING: DOCKER_INSECURE_NO_IPTABLES_RAW=1

(Maybe worth noting, I think this will go away once we've switched to native nftables.)

@robmry robmry marked this pull request as ready for review March 10, 2025 17:31
@robmry robmry requested review from akerouanton and thaJeztah March 10, 2025 17:32
@thaJeztah
Member

Sure, how does this look? ... could add some words about the implications, but that could get a bit too wordy?

Probably fine; if we need more, we could of course add a https://docs.docker.com/go/.... link, but if it's temporary, and an explicit opt-in, it's probably fine to keep it basic. Unless you have a very brief description, but otherwise, perhaps just leave it as this.

(Having it in docker info can help us as well when triaging bug-reports if the user posts the output in the ticket)

}
// Env-var belonging to the bridge driver, disables use of the iptables "raw" table.
if os.Getenv("DOCKER_INSECURE_NO_IPTABLES_RAW") == "1" {
	v.Warnings = append(v.Warnings, "WARNING: DOCKER_INSECURE_NO_IPTABLES_RAW=1")
Member

If we do want to add some words around it, perhaps something like;

Suggested change
- v.Warnings = append(v.Warnings, "WARNING: DOCKER_INSECURE_NO_IPTABLES_RAW=1")
+ v.Warnings = append(v.Warnings, "WARNING: DOCKER_INSECURE_NO_IPTABLES_RAW is set")
Suggested change
- v.Warnings = append(v.Warnings, "WARNING: DOCKER_INSECURE_NO_IPTABLES_RAW=1")
+ v.Warnings = append(v.Warnings, "WARNING: raw iptables rules are skipped because DOCKER_INSECURE_NO_IPTABLES_RAW is set")

The "DOCKER_INSECURE_NO_IPTABLES_RAW is set" wording may be slightly nicer to indicate what's the case, but I'm not sure about "raw iptables rules are skipped"; I'm not sure that provides much context on its own.

Contributor Author

Ok, yes - I'll change it to "is set".
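
For reference, a minimal sketch of the check with the agreed "is set" wording. The `info` struct is a hypothetical stand-in for the daemon's system-info type (only the `Warnings` field is modeled), and `addRawWarning` with its injected `getenv` parameter is an assumption made for testability, not the daemon's actual function.

```go
package main

import (
	"fmt"
	"os"
)

// info is a hypothetical stand-in for the daemon's system info;
// only the Warnings field is modeled here.
type info struct {
	Warnings []string
}

// addRawWarning appends a warning when the env var is set to "1".
// getenv is injected for testability; pass os.Getenv for real use.
func addRawWarning(v *info, getenv func(string) string) {
	// Env-var belonging to the bridge driver, disables use of the iptables "raw" table.
	if getenv("DOCKER_INSECURE_NO_IPTABLES_RAW") == "1" {
		v.Warnings = append(v.Warnings, "WARNING: DOCKER_INSECURE_NO_IPTABLES_RAW is set")
	}
}

func main() {
	v := &info{}
	addRawWarning(v, os.Getenv)
	for _, w := range v.Warnings {
		fmt.Println(w)
	}
}
```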

"gotest.tools/v3/golden"

containertypes "github.com/docker/docker/api/types/container"
Member

Micro-nit (if you'll be pushing again); looks like this one ended up separate from the other imports below

Contributor Author

Doh! Thank you ... something in my GoLand setup does that, maybe gofumpt. It's very annoying, but haven't tracked it down yet.

Member

@thaJeztah thaJeztah Mar 10, 2025

Yeah, possibly it's using a different convention (which we possibly should adopt) where imports are in 3 blocks instead of 2;

[go stdlib imports]

[external imports]

[local imports]

That's kinda ok, but I also see it more often result in blocks accidentally being split up even further. I think there's a linter for that though (still to look into).

For kernels that don't have CONFIG_IP_NF_RAW, if the env
var DOCKER_INSECURE_NO_IPTABLES_RAW is set to "1", don't
try to create raw rules.

This means direct routing to published ports is possible
from other hosts on the local network, even if the port
is published to a loopback address.

Signed-off-by: Rob Murray <[email protected]>

func rawRulesDisabled(ctx context.Context) bool {
	if os.Getenv("DOCKER_INSECURE_NO_IPTABLES_RAW") == "1" {
		log.G(ctx).Debug("DOCKER_INSECURE_NO_IPTABLES_RAW=1 - skipping raw rules")
Member

Is this logged once, or for every container? If it's once, then Warn would be fine as well

Contributor Author

Every container, on create and delete ... I figured it'd be better to repeat it, so it's there in any log snippets we get.
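
As an aside on the trade-off discussed here, the two options could be combined: warn once, and keep the per-container debug line. This is a sketch of an alternative only, not what the PR does. `skipNotifier` and its injected `logf` sink are hypothetical; real code would use the daemon's logger.

```go
package main

import (
	"fmt"
	"sync"
)

// skipNotifier is a hypothetical helper, not code from this PR.
type skipNotifier struct {
	once sync.Once
	logf func(level, msg string) // injected sink, an assumption of this sketch
}

// rawRulesSkipped is meant to be called on every container create and
// delete: it warns on the first call only and logs at debug on every call.
func (n *skipNotifier) rawRulesSkipped() {
	const msg = "DOCKER_INSECURE_NO_IPTABLES_RAW=1 - skipping raw rules"
	n.once.Do(func() { n.logf("warn", msg) })
	n.logf("debug", msg)
}

func main() {
	n := &skipNotifier{logf: func(level, msg string) {
		fmt.Printf("level=%s msg=%q\n", level, msg)
	}}
	for i := 0; i < 3; i++ {
		n.rawRulesSkipped()
	}
}
```

The repeated debug line still shows up in any log snippet taken with debug logging enabled, while the single warning is visible at the default log level.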

Member

@thaJeztah thaJeztah left a comment

code LGTM

I'll leave it to @akerouanton for the networking side of things 🤗

@akerouanton akerouanton merged commit 4ff19b2 into moby:master Mar 13, 2025
152 checks passed
@vvoland
Contributor

vvoland commented Mar 13, 2025

WDYT about changelog:

Add environment variable `DOCKER_INSECURE_NO_IPTABLES_RAW=1` to allow Docker to run on systems where the Linux kernel can't provide `CONFIG_IP_NF_RAW` support. When enabled, Docker will not create rules in the iptables `raw` table. Warning: This is not recommended for production environments as it reduces security by allowing other hosts on the local network to route to ports published to host addresses, even when they are published to `127.0.0.1`. This option bypasses some of the security hardening introduced in Docker Engine 28.0.0.

Development

Successfully merging this pull request may close these issues.

Failed to set up container networking 28.0.1 (module IP_NF_RAW dependency)
