This commit introduces a new `--resync-period` flag to control how often
the Kilo controllers should reconcile.
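
As a rough illustration of how such a flag could be parsed, here is a
minimal sketch using only the Go standard library. The helper name
`parseResyncPeriod`, the flag set name, and the 30s default are
assumptions made for this example; they are not taken from Kilo.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseResyncPeriod parses a command line and returns the value of
// --resync-period. The 30s default is an assumption for this sketch.
func parseResyncPeriod(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("kilo-sketch", flag.ContinueOnError)
	period := fs.Duration("resync-period", 30*time.Second,
		"how often the controllers should reconcile")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *period, nil
}

func main() {
	period, err := parseResyncPeriod([]string{"--resync-period=5s"})
	if err != nil {
		panic(err)
	}
	fmt.Println("reconciling every", period)
}
```

A controller loop could then feed the parsed duration to
`time.NewTicker` and reconcile on every tick.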
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit adds a logger to the iptables controller using the options
pattern. It also logs when the controller needs to reset rules, to be
able to identify costly reconciliations.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Currently, every time the iptables controller syncs rules, it spawns an
iptables process for every rule it checks. This causes two problems:
1. it creates unnecessary load on the system; and
2. it causes contention on the xtables lock file.
This commit creates a lazy cache for iptables rules and chains that
avoids spawning iptables processes. This means that each time the
iptables rules are reconciled, if no rules need to be changed then at
most one iptables process should be spawned to check all of the rules in
a chain and at most one process should be spawned to check all of the
chains in a table.
Note: the success of this reduction in calls to iptables depends on a
somewhat fragile comparison of iptables rule text. The text of any rule
must match exactly, including the order of the flags. An improvement to
come would be to implement an iptables rule parser that can be used to
check semantic equivalence between iptables rules.
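
The idea of a lazy, per-chain cache can be sketched as follows. This is
not the code from this commit: the `lister` hook, the `ruleCache` type,
and the chain name are hypothetical, and the lister stands in for a
single iptables invocation that lists every rule in a chain.

```go
package main

import "fmt"

// lister stands in for one iptables invocation that lists all rules of
// a chain. It is a hypothetical hook, not Kilo's real interface.
type lister func(table, chain string) ([]string, error)

// ruleCache lazily loads the rules of a chain the first time it is queried.
type ruleCache struct {
	list  lister
	rules map[string]map[string]struct{} // "table/chain" -> set of rule specs
}

func newRuleCache(l lister) *ruleCache {
	return &ruleCache{list: l, rules: make(map[string]map[string]struct{})}
}

// exists reports whether spec is present in the chain, calling the
// lister at most once per chain for the lifetime of the cache.
func (c *ruleCache) exists(table, chain, spec string) (bool, error) {
	key := table + "/" + chain
	set, ok := c.rules[key]
	if !ok {
		listed, err := c.list(table, chain)
		if err != nil {
			return false, err
		}
		set = make(map[string]struct{}, len(listed))
		for _, r := range listed {
			set[r] = struct{}{}
		}
		c.rules[key] = set
	}
	// Exact text match: the order of the flags matters.
	_, found := set[spec]
	return found, nil
}

func main() {
	calls := 0
	c := newRuleCache(func(table, chain string) ([]string, error) {
		calls++
		return []string{"-s 10.4.0.0/16 -j ACCEPT"}, nil
	})
	a, _ := c.exists("filter", "KILO-SKETCH", "-s 10.4.0.0/16 -j ACCEPT")
	b, _ := c.exists("filter", "KILO-SKETCH", "-j ACCEPT -s 10.4.0.0/16")
	fmt.Println(a, b, calls) // prints: true false 1
}
```

The second lookup shows the fragility described in the note: the same
rule with its flags in a different order is reported as missing. A real
cache would also need to be discarded at the start of each
reconciliation so that it does not serve stale rules.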
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Because of new naming conventions for locations, the CIDRs were not
being set within locations.
This led to no iptables rules being added for nodes in the same location.
This commit fixes a bug where the variable holding the index of the
private interface was shadowed, causing it to always be "0".
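
The affected code is not reproduced in this message, so the following is
only a generic sketch of this class of bug in Go. The names `lookup`,
`shadowed`, and `fixed` are hypothetical.

```go
package main

import "fmt"

// lookup is a hypothetical helper returning an interface index.
func lookup() (int, error) { return 3, nil }

// shadowed reproduces the class of bug: `:=` inside the block declares
// a new privIface, so the outer variable keeps its zero value.
func shadowed(found bool) int {
	var privIface int
	if found {
		privIface, err := lookup()
		if err != nil {
			return -1
		}
		_ = privIface // only the inner variable was set
	}
	return privIface
}

// fixed assigns to the outer variable with `=` instead of `:=`.
func fixed(found bool) int {
	var privIface int
	if found {
		var err error
		privIface, err = lookup()
		if err != nil {
			return -1
		}
	}
	return privIface
}

func main() {
	fmt.Println(shadowed(true), fixed(true)) // prints: 0 3
}
```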
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Add default iptables rules to allow forwarding traffic to and from the
pod CIDR. Previously, Kilo expected the default behaviour of the forward
chain to be to accept packets, which cannot be guaranteed.
This commit changes the graph so that the WireGuard CIDR is used as the
title rather than the pod subnet assigned to a node in the cluster.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This ensures that Kilo will not select an IP assigned to the Kilo
interface when discovering public and private IPs.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit introduces a new Prometheus metric to detect if the node is
a leader of its location, from its own point of view.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Previously, when `deleteFromIndex` exited early due to an error, nil
rules would be left in the controller's list of rules, which could
provoke a panic on the next reconciliation. This commit ensures that nil
rules are removed before an early exit.
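
One way to guarantee this is to compact the list on every return path.
The sketch below assumes simplified stand-in types (`rule`,
`controller`, the `del` hook, `errBackend`) and is not the actual
implementation of `deleteFromIndex`.

```go
package main

import (
	"errors"
	"fmt"
)

// errBackend simulates a failing iptables call in this sketch.
var errBackend = errors.New("backend failure")

// rule is a stand-in for the controller's rule type.
type rule struct{ spec string }

type controller struct {
	rules []*rule
	del   func(*rule) error // hypothetical hook that deletes a rule from the backend
}

// deleteFromIndex deletes rules[i:] and marks each deleted entry nil.
// The deferred compaction runs on every return path, including early
// exits, so no nil entries survive in c.rules.
func (c *controller) deleteFromIndex(i int) error {
	defer c.compact()
	for j := i; j < len(c.rules); j++ {
		if err := c.del(c.rules[j]); err != nil {
			return fmt.Errorf("failed to delete rule %q: %w", c.rules[j].spec, err)
		}
		c.rules[j] = nil
	}
	return nil
}

// compact drops nil entries while preserving the order of the rest.
func (c *controller) compact() {
	kept := c.rules[:0]
	for _, r := range c.rules {
		if r != nil {
			kept = append(kept, r)
		}
	}
	c.rules = kept
}

func main() {
	c := &controller{
		rules: []*rule{{spec: "a"}, {spec: "b"}, {spec: "c"}},
		del: func(r *rule) error {
			if r.spec == "c" {
				return errBackend
			}
			return nil
		},
	}
	err := c.deleteFromIndex(1)
	fmt.Println(err != nil, len(c.rules)) // prints: true 2
}
```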
Fixes: #51
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Previously, when updating the persistent keepalive of a node via
annotations, the node's WireGuard configuration was not updated. This
corrects the behavior.
Fixes: #54
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit adds support for defining preshared keys when declaring a
new Peer CRD. This preshared key will be used whenever the nodes in the
Kilo mesh communicate with that peer.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit enables simultaneously managing IPv4 and IPv6 iptables
rules. This makes it possible to have peers with IPv6 allowed IPs in an
otherwise IPv4 stack and vice versa.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit enables NAT-ing packets outgoing to the WAN from both the
Pod subnet as well as from peers. This means that Pods can access the
Internet and that peers can use the Kilo mesh as a gateway to the
Internet.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit better organizes the location of iptables rules. This is
made possible by exposing two new funcs, `NewRule` and `NewChain`.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit re-enables old functionality, which permitted the generation
of the configuration for a cluster without any peers.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit changes how Kilo allows nodes and peers behind NAT to roam.
Rather than ignoring changes to endpoints when comparing WireGuard
configurations, Kilo now incorporates changes to endpoints for peers
behind NAT into its configuration first and later compares the
configurations.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit documents the use of the persistent-keepalive annotation and
corrects the implementation of keepalives.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
Kilo had a routine that synchronized the endpoints of peers back into
the API to ensure that endpoints updated by WireGuard for a roaming peer
would always positively compare with the endpoints in the API. This is
no longer needed as Kilo will now simply ignore changes to endpoints for
peers with a non-zero persistent keepalive.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit enables Kilo to ignore changes to the endpoints of peers
that sit behind a NAT gateway. We use the heuristic of a non-zero
persistent keepalive to decide whether the endpoint field should be
ignored. This will allow NATed peers to roam and for every node in the
cluster to have a different value for a peer's endpoint, as is natural
when a peer's connections are NATed.
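
The heuristic can be sketched as a comparison function. The `peer`
struct below is a simplified stand-in with assumed field names, not
Kilo's WireGuard configuration type.

```go
package main

import "fmt"

// peer is a simplified stand-in for a WireGuard peer configuration.
type peer struct {
	PublicKey           string
	Endpoint            string // host:port
	PersistentKeepalive int    // seconds; 0 means disabled
}

// equal compares two peers. When the persistent keepalive is non-zero,
// the peer is assumed to sit behind NAT and its endpoint is ignored.
func equal(a, b peer) bool {
	if a.PublicKey != b.PublicKey || a.PersistentKeepalive != b.PersistentKeepalive {
		return false
	}
	if a.PersistentKeepalive != 0 {
		return true // the endpoint may legitimately differ per node
	}
	return a.Endpoint == b.Endpoint
}

func main() {
	a := peer{PublicKey: "key", Endpoint: "198.51.100.1:51820", PersistentKeepalive: 25}
	b := a
	b.Endpoint = "203.0.113.7:40000"
	fmt.Println(equal(a, b)) // prints: true
}
```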
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit fixes the issue encountered in #36, where the CNI config is
touched even though CNI management is disabled.
Fixes: #36
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit allows DNS names to be used when specifying the endpoint
for a node in the WireGuard mesh. This is useful in many scenarios, in
particular when operating an IoT device whose public IP is dynamic. This
change allows the administrator to use a dynamic DNS name in the node's
endpoint.
One of the side-effects of this change is that the WireGuard port can
now be specified individually for each node in the mesh, if the
administrator wishes to do so.
*Note*: this commit introduces a breaking change; the
`force-external-ip` node annotation has been removed; its functionality
has been ported over to the `force-endpoint` annotation. This annotation
is documented in the annotations.md file. The expected content of this
annotation is no longer a CIDR but rather a host:port. The host can be
either a DNS name or an IP.
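
For illustration, a host:port value of this shape can be validated with
the Go standard library. The `endpoint` type and `parseEndpoint` helper
are hypothetical names for this sketch and do not reflect how Kilo
parses the annotation.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// endpoint is a hypothetical parsed form of a host:port annotation value.
type endpoint struct {
	Host string
	Port uint16
	IsIP bool // false means Host should be treated as a DNS name
}

// parseEndpoint splits a host:port string and validates the port.
func parseEndpoint(s string) (endpoint, error) {
	host, portStr, err := net.SplitHostPort(s)
	if err != nil {
		return endpoint{}, fmt.Errorf("invalid endpoint %q: %w", s, err)
	}
	if host == "" {
		return endpoint{}, fmt.Errorf("invalid endpoint %q: empty host", s)
	}
	port, err := strconv.ParseUint(portStr, 10, 16)
	if err != nil {
		return endpoint{}, fmt.Errorf("invalid port in %q: %w", s, err)
	}
	return endpoint{Host: host, Port: uint16(port), IsIP: net.ParseIP(host) != nil}, nil
}

func main() {
	for _, s := range []string{"example.com:51820", "[2001:db8::1]:51820", "10.0.0.1/32"} {
		e, err := parseEndpoint(s)
		fmt.Println(e, err)
	}
}
```

The last input is a CIDR, the format expected by the removed
annotation, and is rejected because it has no port.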
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit entirely replaces NAT in Kilo with a few iproute2 rules.
Previously, Kilo would source-NAT the majority of packets in order to
avoid problems with strict source checks in cloud providers causing
packets to be considered martians. This source-NAT-ing made it
difficult to correctly apply Kubernetes NetworkPolicies based on source
IPs.
This rewrite instead relies on a handful of iproute2 rules to ensure
that packets get encapsulated in certain scenarios based on the source
network and/or source interface.
This has the benefit of avoiding extra iptables bloat as well as
enabling better compatibility with NetworkPolicies.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit cleans up the iptables package to allow other packages to
create rules.
This commit also removes all NAT from Kilo.
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit fixes the IP allocator `newAllocator` to produce IP
addresses with the original network mask. This makes more sense. The
original functionality can be reproduced by wrapping the produced IP
address with the `oneAddressCIDR` helper.
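
The difference between the two forms can be sketched like this. The
body of `oneAddressCIDR` is an assumed implementation written for this
example, and `withOriginalMask` and `formatBoth` are hypothetical
helpers; none of them is copied from Kilo.

```go
package main

import (
	"fmt"
	"net"
)

// oneAddressCIDR wraps an IP in a single-address network (/32 or /128).
// This body is an assumption about what the helper does.
func oneAddressCIDR(ip net.IP) *net.IPNet {
	if v4 := ip.To4(); v4 != nil {
		return &net.IPNet{IP: v4, Mask: net.CIDRMask(32, 32)}
	}
	return &net.IPNet{IP: ip, Mask: net.CIDRMask(128, 128)}
}

// withOriginalMask pairs an allocated address with the mask of the
// subnet it was allocated from.
func withOriginalMask(ip net.IP, subnet *net.IPNet) *net.IPNet {
	return &net.IPNet{IP: ip, Mask: subnet.Mask}
}

// formatBoth renders addr in both forms for comparison.
func formatBoth(subnetCIDR, addr string) (string, string, error) {
	_, subnet, err := net.ParseCIDR(subnetCIDR)
	if err != nil {
		return "", "", err
	}
	ip := net.ParseIP(addr)
	if ip == nil || !subnet.Contains(ip) {
		return "", "", fmt.Errorf("%q is not an address in %s", addr, subnet)
	}
	return withOriginalMask(ip, subnet).String(), oneAddressCIDR(ip).String(), nil
}

func main() {
	masked, single, err := formatBoth("10.4.0.0/16", "10.4.0.1")
	if err != nil {
		panic(err)
	}
	fmt.Println(masked, single) // prints: 10.4.0.1/16 10.4.0.1/32
}
```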
Signed-off-by: Lucas Servén Marín <lserven@gmail.com>
This commit updates the well-known label to determine the region of the
node to topology.kubernetes.io/region, which is the new standard as
defined by the Kubernetes documentation, now that
failure-domain.beta.kubernetes.io/region has been deprecated.
This commit takes a big step towards ensuring that iptables rules are
always kept in the correct order. Specifically, when re-setting a
ruleset, any time a rule is missing, that rule and all following rules
are re-added to ensure that from that index onwards all rules are in the
right order. Similarly, when reconciling an existing ruleset against the
backend, if a rule is missing, that rule and all following rules are
re-added.
This change does not guarantee that the order of rules in the backend
is correct. However, unless another actor is modifying the order of
rules in iptables, all rules created by Kilo should now be kept in the
correct order.
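
The selection of rules to re-add can be sketched as a suffix
computation. `rulesToReAdd` and the `exists` callback are hypothetical
names for this example; the real controller works on rule objects
rather than strings.

```go
package main

import "fmt"

// rulesToReAdd returns the suffix of desired that starts at the first
// rule missing from the backend. Re-adding that whole suffix, rather
// than only the missing rule, keeps the relative order intact.
func rulesToReAdd(desired []string, exists func(string) bool) []string {
	for i, r := range desired {
		if !exists(r) {
			return desired[i:]
		}
	}
	return nil
}

func main() {
	desired := []string{"rule-a", "rule-b", "rule-c", "rule-d"}
	backend := map[string]bool{"rule-a": true, "rule-c": true, "rule-d": true}
	fmt.Println(rulesToReAdd(desired, func(r string) bool { return backend[r] }))
	// prints: [rule-b rule-c rule-d]
}
```

In this sketch, rules that still exist in the backend but follow a
missing rule would have to be deleted before being appended again.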
Fixes: #19
This commit makes it possible to specify the Kilo interface name. If the
specified interface exists, it will be used; if it does not exist, Kilo
will create it. If the interface already existed, then it will not be
deleted on shutdown; otherwise Kilo will destroy the interface.
Fixes: https://github.com/squat/kilo/issues/8
Addresses: 1/2 of https://github.com/squat/kilo/issues/17
If the hostname fails to resolve, this should not be considered a
blocking error. Most likely, it means that the hostname is simply not
resolvable, which should not be a requirement to run Kilo. In this case,
simply try to find a valid IP from other sources.
This commit adds basic support to run in compatibility mode with
Flannel. This allows clusters running Flannel as their principal
networking solution to leverage some advanced Kilo features. In certain
Flannel setups, the clusters can even leverage multi-cloud. For this, the
cluster needs to either run in a full mesh, or Flannel needs to use the
API server's external IP address.
Add an exception to the route generation rules for when the external IP
of a node equals the internal IP. In this case, we cannot route traffic
through a tunnel.
This commit ensures that the WireGuard private key is re-used between
container restarts. The result of this is that external peers can keep
using their configuration and don't need to be re-configured just
because the Kilo container restarted.
We need to defensively deduplicate peer allowed IPs.
If two peers claim the same IP, the WireGuard configuration
could flap, causing the interface to churn.
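
A minimal sketch of such deduplication follows. The `peer` type is a
simplified stand-in, and the policy that the first peer to claim an IP
keeps it is an assumption for this example.

```go
package main

import "fmt"

// peer is a simplified stand-in for a WireGuard peer.
type peer struct {
	Name       string
	AllowedIPs []string // CIDR strings
}

// deduplicate drops any allowed IP that an earlier peer, or the same
// peer, has already claimed. The input should be in a deterministic
// order so that the result does not change between runs.
func deduplicate(peers []peer) []peer {
	seen := make(map[string]struct{})
	out := make([]peer, 0, len(peers))
	for _, p := range peers {
		var ips []string
		for _, ip := range p.AllowedIPs {
			if _, dup := seen[ip]; dup {
				continue
			}
			seen[ip] = struct{}{}
			ips = append(ips, ip)
		}
		out = append(out, peer{Name: p.Name, AllowedIPs: ips})
	}
	return out
}

func main() {
	peers := []peer{
		{Name: "a", AllowedIPs: []string{"10.5.0.1/32", "10.5.0.2/32"}},
		{Name: "b", AllowedIPs: []string{"10.5.0.2/32", "10.5.0.3/32"}},
	}
	fmt.Println(deduplicate(peers))
}
```

Note that this compares CIDRs as plain strings, so two spellings of the
same network would not be detected as duplicates.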
This commit adds several output options to the `showconf` command of the
`kgctl` binary:
* `--as-peer`: this can be used to generate a peer configuration, which
can be used to configure the selected resource as a peer of another
WireGuard interface
* `--output`: this can be used to select the desired output format of
the peer resource; the available options are WireGuard, YAML, and JSON.
When interfaces on the host churn, the kernel will remove routes
associated with those interfaces. This could cause the Kilo route
controller to become out of sync with the routes that really exist. This
commit fixes this behavior.
This commit enables Kilo to work as an independent networking provider.
This is done by leveraging CNI. Kilo brings the necessary CNI plugins to
operate and takes care of all networking.
Add-on compatibility for Calico, Flannel, etc, will be re-introduced
shortly.
This commit exposes a new Prometheus metric to track the number of
reconciliation attempts. This is important, as without this, the number
of errors is not too helpful. A more valuable statistic is the
proportion of reconciliations that result in an error.
This commit introduces liveness checks to Kilo. This allows the Kilo
daemons to take nodes with inactive or dead Kilo daemons out of the
topology until they are alive again.
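
One common way to implement such a check is a heartbeat timestamp with a
timeout. The sketch below assumes that approach; the `node` type, the
`LastSeen` field, and the 150s timeout are hypothetical and are not
taken from this commit.

```go
package main

import (
	"fmt"
	"time"
)

// node is a simplified stand-in for a mesh node.
type node struct {
	Name     string
	LastSeen int64 // Unix seconds of the last heartbeat; 0 means never seen
}

// alive reports whether the node's last heartbeat is within timeout seconds.
func alive(n node, now, timeout int64) bool {
	return n.LastSeen != 0 && now-n.LastSeen <= timeout
}

// liveNodes filters out nodes whose daemon appears inactive or dead.
func liveNodes(nodes []node, now, timeout int64) []node {
	var out []node
	for _, n := range nodes {
		if alive(n, now, timeout) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	now := time.Now().Unix()
	nodes := []node{{"a", now - 10}, {"b", now - 600}, {"c", 0}}
	for _, n := range liveNodes(nodes, now, 150) {
		fmt.Println(n.Name) // only "a" passes the check
	}
}
```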