From 31eee77397a1ff3ac8c2cd6ff87ee4b55bd421cc Mon Sep 17 00:00:00 2001
From: Adolfo Delorenzo
Date: Fri, 15 May 2026 11:48:43 -0600
Subject: [PATCH] fix(kernel): enable nftables NUMGEN + HASH + helper expressions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fourth round of the v0.3 nftables-on-arm64 debug saga. After the
NF_TABLES_IPV4 family fix from 7e46f8f, KubeSolo + containerd + a CoreDNS
pod all reach Running state, but kube-proxy fails to install Service
rules:

  add rule ip kube-proxy service-2QRHZV4L-default/kubernetes/tcp/https numgen random mod 1 vmap { 0 : goto ... }
                                                                       ^^^^^^^^^^^^^^^^^^^
  Error: Could not process rule: No such file or directory

The caret points at `numgen random mod 1`. That's the nftables NUMGEN
expression — kube-proxy's nftables backend uses it for random endpoint
load-balancing across Service endpoints. Without CONFIG_NFT_NUMGEN
compiled into the kernel, every Service sync fails and kube-dns / any
ClusterIP is unreachable.

Cascade: kube-proxy sync fail -> kube-dns Service has no DNAT -> CoreDNS
readiness probe never goes Ready -> KubeSolo's coredns deploy step times
out after 15 attempts -> FTL -> kernel panic.

Fix: add NFT_NUMGEN to kernel-container.fragment, plus the small family
of expression modules kube-proxy and CNI plugins commonly use so we don't
repeat this debug loop for the next missing one:

  CONFIG_NFT_NUMGEN=m   random / inc LB
  CONFIG_NFT_HASH=m     consistent-hash LB (sessionAffinity=ClientIP)
  CONFIG_NFT_OBJREF=m   named objects (counters, quotas) refs in rules
  CONFIG_NFT_LIMIT=m    rate-limit expression
  CONFIG_NFT_LOG=m      log expression (used by some CNI debug rules)

All =m so init's stage-30 loads them from modules.list /
modules-arm64.list alongside the existing nft_nat / nft_masq /
nft_compat.

This needs another kernel rebuild (rm -rf build/cache/kernel-arm64-generic,
sudo make kernel-arm64) on the Odroid. After that we should have a fully
working KubeSolo OS v0.3 on ARM64 generic — at which point the only thing
left is to tag v0.3.1 and verify the rewritten release.yaml workflow
publishes both arches automatically.

Note on runc-PATH log noise: containerd-shim-runc-v2 -info probes for
runc in $PATH and fails because KubeSolo's runc lives at
/var/lib/kubesolo/containerd/runc. This is cosmetic — actual container
creation uses an absolute path from the containerd config and works fine
(CoreDNS container did start successfully). Will polish in v0.3.2.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 build/config/kernel-container.fragment | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/build/config/kernel-container.fragment b/build/config/kernel-container.fragment
index 9a03999..a8bfde0 100644
--- a/build/config/kernel-container.fragment
+++ b/build/config/kernel-container.fragment
@@ -62,9 +62,9 @@ CONFIG_NF_TABLES_IPV6=y
 CONFIG_NF_TABLES_INET=y
 CONFIG_NF_TABLES_NETDEV=y
 
-# nftables expression modules used by KubeSolo's masquerade ruleset and
-# kube-proxy's nft-compat path. Listed in modules.list / modules-arm64.list
-# so init loads them at boot.
+# nftables expression modules used by KubeSolo's masquerade ruleset, the
+# kube-proxy nft backend (Kubernetes 1.34+), and the xtables compat path.
+# Listed in modules.list / modules-arm64.list so init loads them at boot.
 CONFIG_NFT_NAT=m
 CONFIG_NFT_MASQ=m
 CONFIG_NFT_CT=m
@@ -75,6 +75,18 @@ CONFIG_NFT_COMPAT=m
 CONFIG_NFT_FIB=m
 CONFIG_NFT_FIB_IPV4=m
 CONFIG_NFT_FIB_IPV6=m
+# numgen drives kube-proxy's random / round-robin endpoint LB:
+# `numgen random mod N vmap { ... }` in service rules.
+# Without it kube-proxy's nft sync fails with ENOENT on every service.
+CONFIG_NFT_NUMGEN=m
+# hash drives consistent-hash LB (sessionAffinity=ClientIP, etc.).
+CONFIG_NFT_HASH=m
+# objref / limit / log are used by various policy expressions kube-proxy and
+# CNI plugins emit. Including them pre-empts a future "could not process
+# rule" debug loop.
+CONFIG_NFT_OBJREF=m
+CONFIG_NFT_LIMIT=m
+CONFIG_NFT_LOG=m
 
 # IPv4 NAT bits NFT_MASQ depends on. Auto-selected on most kernels but we
 # pin them explicitly so olddefconfig doesn't strip them when the fragment