Since #116 implemented fragile comparisons of iptables rules to avoid
calling the iptables binary excessively during every reconciliation, the
iptables rules for IPIP encapsulation must be updated to match the
expected output. One complication is that rather than returning the
protocol number in the rule, iptables resolves the protocol number to a
name by looking up the number in the netdb protocols database. This name
can vary depending on the host's environment. This commit adds two
solutions for resolving the protocol name:
1. a fixed mapping to the string `ipencap`, which should always work
for Kilo whenever it runs in the Alpine Linux container; and
2. a runtime lookup using the netdb database, which only works if Kilo is
compiled with CGO and is meant to be used only if Kilo is not running in
the normal container environment.
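The two strategies can be sketched roughly as follows. This is only an illustration, not the code in this commit: the names `fixedProtocolName` and `lookupProtocolName` are hypothetical, and the lookup parses text in the `/etc/protocols` format in pure Go instead of calling the C library through CGO.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// ipipProtocolNumber is the IANA protocol number for IPv4-in-IP encapsulation.
const ipipProtocolNumber = 4

// fixedProtocolName returns the hard-coded name that the commit message says
// is expected inside the Alpine Linux container.
func fixedProtocolName() string {
	return "ipencap"
}

// lookupProtocolName scans text in /etc/protocols format ("name number aliases")
// and returns the name registered for the given protocol number.
func lookupProtocolName(db string, number int) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(db))
	for scanner.Scan() {
		line := scanner.Text()
		// Drop trailing comments.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		n, err := strconv.Atoi(fields[1])
		if err != nil {
			continue
		}
		if n == number {
			return fields[0], true
		}
	}
	return "", false
}

func main() {
	// Sample database contents; a real host may use a different name for 4.
	db := "ip 0 IP # internet protocol\nipencap 4 IP-ENCAP # IP encapsulated in IP\n"
	name, ok := lookupProtocolName(db, ipipProtocolNumber)
	if !ok {
		// Fall back to the fixed mapping when the lookup finds nothing.
		name = fixedProtocolName()
	}
	fmt.Println(name)
}
```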
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Currently Kilo incorrectly identifies the 172.16/12 private IP range.
This commit fixes the logic.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit introduces a new `--resync-period` flag to control how often
the Kilo controllers should reconcile.
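A duration flag of this kind could be parsed as in the sketch below. It uses the standard `flag` package purely for illustration; the flag library Kilo actually uses, the 30s default, and the helper name `parseResyncPeriod` are all assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseResyncPeriod reads --resync-period from args.
// The 30s default is an assumption made for this sketch.
func parseResyncPeriod(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("kilo-sketch", flag.ContinueOnError)
	period := fs.Duration("resync-period", 30*time.Second,
		"how often the controllers should reconcile")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *period, nil
}

func main() {
	period, err := parseResyncPeriod([]string{"--resync-period=1m"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("reconciling every", period)
}
```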
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit adds a logger to the iptables controller using the options
pattern. It also logs when the controller needs to reset rules, to be
able to identify costly reconciliations.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Currently, every time the iptables controller syncs rules, it spawns an
iptables process for every rule it checks. This causes two problems:
1. it creates unnecessary load on the system; and
2. it causes contention on the xtables lock file.
This commit creates a lazy cache for iptables rules and chains that
avoids spawning iptables processes. This means that each time the
iptables rules are reconciled, if no rules need to be changed then at
most one iptables process should be spawned to check all of the rules in
a chain and at most one process should be spawned to check all of the
chains in a table.
Note: the success of this reduction in calls to iptables depends on a
somewhat fragile comparison of iptables rule text. The text of any rule
must match exactly, including the order of the flags. An improvement to
come would be to implement an iptables rule parser that can be used to
check semantic equivalence between iptables rules.
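The caching idea and the exact-text comparison can be sketched as follows. All names here (`lister`, `ruleCache`, `exists`) are hypothetical, the rule text is illustrative, and the `lister` function stands in for the single iptables invocation that lists a chain; the sketch assumes a fresh cache is created for each reconciliation.

```go
package main

import "fmt"

// lister abstracts the one call that lists every rule in a chain.
type lister func(table, chain string) ([]string, error)

// ruleCache lazily loads the rules of a chain on first use.
type ruleCache struct {
	list  lister
	rules map[string]map[string]struct{} // keyed by "table/chain"
}

func newRuleCache(l lister) *ruleCache {
	return &ruleCache{list: l, rules: make(map[string]map[string]struct{})}
}

// exists reports whether spec is present, matching the rule text exactly.
func (c *ruleCache) exists(table, chain, spec string) (bool, error) {
	key := table + "/" + chain
	set, ok := c.rules[key]
	if !ok {
		// First check for this chain: list it once and remember the result.
		lines, err := c.list(table, chain)
		if err != nil {
			return false, err
		}
		set = make(map[string]struct{}, len(lines))
		for _, line := range lines {
			set[line] = struct{}{}
		}
		c.rules[key] = set
	}
	_, found := set[spec]
	return found, nil
}

func main() {
	calls := 0
	cache := newRuleCache(func(table, chain string) ([]string, error) {
		calls++
		return []string{"-A KILO-NAT -d 10.4.0.0/16 -j RETURN"}, nil
	})
	same, _ := cache.exists("nat", "KILO-NAT", "-A KILO-NAT -d 10.4.0.0/16 -j RETURN")
	// Same meaning, different flag order: the text comparison does not match.
	reordered, _ := cache.exists("nat", "KILO-NAT", "-A KILO-NAT -j RETURN -d 10.4.0.0/16")
	fmt.Println(same, reordered, calls)
}
```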
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Because of new naming conventions for locations, the CIDRs were not
being set within locations.
This led to no iptables rules being added for nodes in the same location.
This commit fixes a bug where the variable holding the index of the
private interface was shadowed, causing it to always be "0".
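The class of bug is easy to reproduce in isolation. The sketch below is not the code that was fixed; the function names and the interface list are hypothetical and only show how `:=` inside a nested block declares a new variable and leaves the outer one at its zero value.

```go
package main

import "fmt"

// shadowedIndex contains the bug: the inner := creates a new variable.
func shadowedIndex(names []string, want string) int {
	index := 0
	for i, name := range names {
		if name == want {
			index := i // shadows the outer index
			_ = index
		}
	}
	return index // always 0
}

// fixedIndex assigns to the outer variable instead.
func fixedIndex(names []string, want string) int {
	index := 0
	for i, name := range names {
		if name == want {
			index = i
		}
	}
	return index
}

func main() {
	ifaces := []string{"lo", "eth0", "eth1"}
	fmt.Println(shadowedIndex(ifaces, "eth1"), fixedIndex(ifaces, "eth1"))
}
```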
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Add default iptables rules to allow forwarded traffic from and to the pod CIDR.
Previously Kilo expected the default behaviour of the forward chain to
accept packets, which cannot be guaranteed.
This commit changes the graph so that the WireGuard CIDR is used as the
title rather than the pod subnet assigned to a node in the cluster.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This ensures that Kilo will not select an IP assigned to the Kilo
interface when discovering public and private IPs.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit introduces a new Prometheus metric to detect if the node is
a leader of its location, from its own point of view.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Previously, when `deleteFromIndex` exited early due to an error, nil
rules would be left in the controller's list of rules, which could
provoke a panic on the next reconciliation. This commit ensures that nil
rules are removed before an early exit.
Fixes: #51
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Previously, when updating the persistent keepalive of a node via
annotations, the node's WireGuard configuration was not updated. This
corrects the behavior.
Fixes: #54
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit adds support for defining preshared keys when declaring a
new Peer CRD. This preshared key will be used whenever the nodes in the
Kilo mesh communicate with that peer.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit enables simultaneously managing IPv4 and IPv6 iptables
rules. This makes it possible to have peers with IPv6 allowed IPs in an
otherwise IPv4 stack and vice versa.
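One way to decide which set of rules an allowed IP belongs to is to inspect its address family, as in this sketch. The helper name `family` is hypothetical and the commit may make the choice differently.

```go
package main

import (
	"fmt"
	"net"
)

// family reports whether a CIDR is IPv4 or IPv6, which determines whether its
// rules would be managed with the IPv4 or the IPv6 iptables client.
func family(cidr string) (string, error) {
	ip, _, err := net.ParseCIDR(cidr)
	if err != nil {
		return "", err
	}
	if ip.To4() != nil {
		return "IPv4", nil
	}
	return "IPv6", nil
}

func main() {
	for _, cidr := range []string{"10.5.0.1/32", "fd00::1/128"} {
		f, err := family(cidr)
		fmt.Println(cidr, f, err)
	}
}
```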
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit enables NAT-ing packets outgoing to the WAN from both the
Pod subnet as well as from peers. This means that Pods can access the
Internet and that peers can use the Kilo mesh as a gateway to the
Internet.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit better organizes the location of iptables rules. This is
made possible by exposing two new funcs, `NewRule` and `NewChain`.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit re-enables old functionality, which permitted the generation
of the configuration for a cluster without any peers.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit changes how Kilo allows nodes and peers behind NAT to roam.
Rather than ignore changes to endpoints when comparing WireGuard
configurations, Kilo now incorporates changes to endpoints for peers
behind NAT into its configuration first and later compares the
configurations.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit documents the use of the persistent-keepalive annotation and
corrects the implementation of keepalives.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Kilo had a routine that synchronized the endpoints of peers back into
the API to ensure that endpoints updated by WireGuard for a roaming peer
would always positively compare with the endpoints in the API. This is
no longer needed as Kilo will now simply ignore changes to endpoints for
peers with a non-zero persistent keepalive.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit enables Kilo to ignore changes to the endpoints of peers
that sit behind a NAT gateway. We use the heuristic of a non-zero
persistent keepalive to decide whether the endpoint field should be
ignored. This allows NATed peers to roam and lets every node in the
cluster have a different value for a peer's endpoint, as is natural
when a peer's connections are NATed.
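The heuristic amounts to a comparison like the one sketched below. The `peer` struct and `peersEqual` function are hypothetical simplifications; the real WireGuard configuration types carry more fields.

```go
package main

import "fmt"

// peer holds only the fields relevant to this comparison.
type peer struct {
	PublicKey           string
	Endpoint            string
	PersistentKeepalive int // seconds; 0 means disabled
}

// peersEqual compares two peers, ignoring the endpoint when a non-zero
// persistent keepalive suggests that the peer sits behind NAT.
func peersEqual(a, b peer) bool {
	if a.PublicKey != b.PublicKey || a.PersistentKeepalive != b.PersistentKeepalive {
		return false
	}
	if a.PersistentKeepalive != 0 {
		return true
	}
	return a.Endpoint == b.Endpoint
}

func main() {
	desired := peer{PublicKey: "key", Endpoint: "203.0.113.7:51820", PersistentKeepalive: 10}
	observed := peer{PublicKey: "key", Endpoint: "198.51.100.9:40000", PersistentKeepalive: 10}
	fmt.Println(peersEqual(desired, observed))
}
```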
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit fixes the issue encountered in #36, where the CNI config is
touched even though CNI management is disabled.
Fixes: #36
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>