Introduction
TL;DR: destination-IP NAT that redirects to RKE2's ingress input chain
I recently had to make my RKE2 cluster cohabit with an in-house service running on the same machines, which also exposes a website on port 80.
The obvious problem here is that RKE2 clusters (most of the time) also need to listen on port 80 in order for RKE2-hosted websites to work. So, how do we share port 80 between RKE2 and another application?
First of all, some information about my specific situation:
Both the in-house service and the RKE2 cluster use a VIP for HA (RKE2 through my Helm chart), and the only entrypoint to the cluster is its VIP. This means that traffic destined for RKE2 and traffic destined for the other application will have different destination IPs (X and Y), which we can use to route traffic.
But what kind of routing do we want to achieve? One solution would be to redirect traffic arriving at IP X on port 80 to port 81, while letting traffic to all other IPs (including Y) reach port 80 untouched, and then configure service X to listen on port 81 instead of 80.
This solution would probably work, but I wasn't interested in it because I wanted both applications to keep the illusion of owning port 80. So I went for another way to route: directly redirecting RKE2's ingress traffic to its internal ingress input chain.
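For reference, the rejected port-shifting approach could be sketched roughly like this (a hypothetical fragment, not the configuration used in this post; X_VIP is a placeholder for IP X):

```nft
# Hypothetical alternative: shift traffic aimed at X_VIP:80 to local port 81,
# leaving port 80 free for everything else
table ip nat {
    chain port_shift {
        type nat hook prerouting priority dstnat;
        ip daddr X_VIP tcp dport 80 redirect to :81
    }
}
```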
Routing on destination IP with nftables
The configuration
To do that, we are going to use nftables.
nftables is the modern iptables replacement, and is the backend that firewalls such as firewalld use. We are going to configure it through a file located at /etc/nftables.conf.
Normally, you can give nftables complete control of the rules by adding flush ruleset, which wipes all rules before re-adding them when the firewall is reloaded. But since our machine runs RKE2, which also edits the firewall state, we have to cohabit with it and only touch our own rules.
So what does the nftables configuration look like ? Here it is:
#!/usr/sbin/nft -f

add table ip nat

# Re-create our chains: "add" first so the "delete" never fails,
# then "delete" so each reload starts from a clean slate
add chain ip nat k8s_ingress_forward_prerouting
add chain ip nat k8s_ingress_forward_output
delete chain ip nat k8s_ingress_forward_prerouting
delete chain ip nat k8s_ingress_forward_output

table ip nat {
    chain k8s_ingress_forward_prerouting {
        # Does NOT match traffic coming from localhost
        type nat hook prerouting priority -199;
        ip daddr RKE2_VIP tcp dport 80 counter jump CNI-HOSTPORT-DNAT;
        ip daddr RKE2_VIP tcp dport 443 counter jump CNI-HOSTPORT-DNAT;
    }

    chain k8s_ingress_forward_output {
        # ONLY match traffic coming from localhost
        type nat hook output priority -199;
        ip daddr RKE2_VIP tcp dport 80 counter jump CNI-HOSTPORT-DNAT;
        ip daddr RKE2_VIP tcp dport 443 counter jump CNI-HOSTPORT-DNAT;
    }
}
You will have to replace RKE2_VIP with your RKE2 VIP (the destination IP used to reach your RKE2 ingress).
This configuration adds our chains k8s_ingress_forward_prerouting and k8s_ingress_forward_output without touching any other chains. These chains redirect traffic destined for RKE2_VIP to the CNI-HOSTPORT-DNAT chain in the same table, which is an internal RKE2 chain. This means the configuration can break at any time. It was tested and functional as of November 2025 on RKE2 version v1.33.5+rke2r1.
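Once the file is in place, you can load it and check that the chains exist and are actually matching traffic; the counter statements in the rules make this visible (these commands require root):

```shell
# Load the configuration
nft -f /etc/nftables.conf

# Inspect our chains; the counters show how many packets/bytes matched
nft list chain ip nat k8s_ingress_forward_prerouting
nft list chain ip nat k8s_ingress_forward_output
```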
Enabling the service THE RIGHT WAY
Before enabling the service, you will need to remove its ExecStop directive. By default, stopping nftables.service flushes the ruleset, which would wipe RKE2's chains; removing ExecStop prevents that.
You can do that by creating the override /etc/systemd/system/nftables.service.d/override.conf with the following content:
[Service]
ExecStop=
and reloading systemd with systemctl daemon-reload.
Then, you can enable it with systemctl enable --now nftables.
Re-applying the rules when RKE2 restarts
For some reason, the rules need to be re-applied whenever RKE2 restarts or starts at boot.
To do that, you can create /etc/systemd/system/rke2-server.service.d/override.conf with the content:
[Service]
ExecStartPost=systemctl restart nftables
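After creating the drop-in, reload systemd and restart RKE2 so the override takes effect (this assumes the server role; on agent nodes the unit would be rke2-agent.service instead):

```shell
systemctl daemon-reload
systemctl restart rke2-server.service

# The nftables rules should have been re-applied; verify with:
nft list chain ip nat k8s_ingress_forward_prerouting
```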