feat: custom kernel build + boot fixes for working container runtime

Build a custom Tiny Core 17.0 kernel (6.18.2) with missing configs
that the stock kernel lacks for container workloads:
- CONFIG_CGROUP_BPF=y (cgroup v2 device control via BPF)
- CONFIG_DEVTMPFS=y (auto-create /dev device nodes)
- CONFIG_DEVTMPFS_MOUNT=y (auto-mount devtmpfs)
- CONFIG_MEMCG=y (memory cgroup controller for memory.max)
- CONFIG_CFS_BANDWIDTH=y (CPU bandwidth throttling for cpu.max)

Also strips unnecessary subsystems (sound, GPU, wireless, Bluetooth,
KVM, etc.) for minimal footprint on a headless K8s edge appliance.

Init system fixes for successful boot-to-running-pods:
- Add switch_root in init.sh to escape initramfs (runc pivot_root)
- Add mountpoint guards in 00-early-mount.sh (skip if already mounted)
- Create essential device nodes after switch_root (kmsg, console, etc.)
- Enable cgroup v2 controller delegation with init process isolation
- Mount BPF filesystem for cgroup v2 device control
- Add mknod fallback from sysfs in 20-persistent-mount.sh for /dev/vda
- Move KubeSolo binary to /usr/bin (avoid /usr/local bind mount hiding)
- Generate /etc/machine-id in 60-hostname.sh (kubelet requires it)
- Pre-initialize iptables tables before kube-proxy starts
- Add nft_reject, nft_fib, xt_nfacct to kernel modules list

Build system changes:
- New build-kernel.sh script for custom kernel compilation
- Dockerfile.builder adds kernel build deps (flex, bison, libelf, etc.)
- Selective kernel module install (only modules.list + transitive deps)
- Install iptables-nft (xtables-nft-multi) + shared libs in rootfs

Tested: ISO boots in QEMU, node reaches Ready in ~35s, CoreDNS and
local-path-provisioner pods start and run successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-02-11 23:13:31 -06:00
parent 456aa8eb5b
commit 39732488ef
13 changed files with 794 additions and 77 deletions

View File

@@ -31,4 +31,17 @@ hostname "$HOSTNAME"
echo "$HOSTNAME" > /etc/hostname
echo "127.0.0.1 $HOSTNAME" >> /etc/hosts
# Generate /etc/machine-id if missing (kubelet requires it)
if [ ! -f /etc/machine-id ]; then
if [ -f "$DATA_MOUNT/etc-kubesolo/machine-id" ]; then
cp "$DATA_MOUNT/etc-kubesolo/machine-id" /etc/machine-id
else
# Generate from hostname hash (deterministic across reboots)
printf '%s' "$HOSTNAME" | md5sum 2>/dev/null | cut -d' ' -f1 > /etc/machine-id || \
cat /proc/sys/kernel/random/uuid 2>/dev/null | tr -d '-' > /etc/machine-id || true
# Persist for next boot
cp /etc/machine-id "$DATA_MOUNT/etc-kubesolo/machine-id" 2>/dev/null || true
fi
fi
log_ok "Hostname set to: $HOSTNAME"