Commit Graph

3 Commits

Author SHA1 Message Date
3bcf2e115f fix(modules): ship and load nft_numgen/hash/limit/log at boot
Some checks failed
ARM64 Build / Build generic ARM64 disk image (push) Failing after 6s
CI / Go Tests (push) Successful in 2m12s
CI / Shellcheck (push) Successful in 55s
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Successful in 1m48s
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Successful in 1m35s
After 31eee77 added CONFIG_NFT_NUMGEN=m and friends to the kernel
fragment, the rebuilt kernel does include nft_numgen.ko on disk in
build/cache/kernel-arm64-generic/modules/. But the runtime kernel
doesn't load it, and kube-proxy keeps failing with the same
"No such file or directory" pointing at `numgen` as before the
kernel rebuild.

Root cause is the boot-stage-vs-lockdown ordering combined with
inject-kubesolo.sh's selective module copy:

  1. inject-kubesolo.sh ships modules listed in modules.list /
     modules-arm64.list plus their transitive deps. nft_numgen wasn't
     in either list, so its .ko is in the kernel build cache but
     never makes it into the initramfs.
  2. Stage 30 (kernel-modules) only modprobes from the same list, so
     it wouldn't load nft_numgen even if the .ko were present.
  3. Stage 85 (security-lockdown) writes 1 to
     /proc/sys/kernel/modules_disabled, blocking any further module
     loads — including the lazy request_module() that nftables would
     otherwise do when kube-proxy first uses the `numgen` expression.

The kernel-side fix (=m in the fragment) is necessary but not
sufficient: we have to ship + load these in stage 30, before lockdown.

Add nft_numgen, nft_hash, nft_limit, nft_log to BOTH modules.list
(x86) and modules-arm64.list. Same justification on x86 — KubeSolo's
nftables kube-proxy backend uses numgen regardless of arch, we just
haven't exercised it on x86 since v0.2 deployments stuck with the
older iptables-restore backend.

After this lands on the Odroid:

  sudo make rootfs-arm64 disk-image-arm64   # kernel cached, rootfs only
  # no kernel rebuild needed; this is a rootfs-only change

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:25:11 -06:00
39732488ef feat: custom kernel build + boot fixes for working container runtime
Build a custom Tiny Core 17.0 kernel (6.18.2) with missing configs
that the stock kernel lacks for container workloads:
- CONFIG_CGROUP_BPF=y (cgroup v2 device control via BPF)
- CONFIG_DEVTMPFS=y (auto-create /dev device nodes)
- CONFIG_DEVTMPFS_MOUNT=y (auto-mount devtmpfs)
- CONFIG_MEMCG=y (memory cgroup controller for memory.max)
- CONFIG_CFS_BANDWIDTH=y (CPU bandwidth throttling for cpu.max)

Also strips unnecessary subsystems (sound, GPU, wireless, Bluetooth,
KVM, etc.) for minimal footprint on a headless K8s edge appliance.

Init system fixes for successful boot-to-running-pods:
- Add switch_root in init.sh to escape initramfs (runc pivot_root)
- Add mountpoint guards in 00-early-mount.sh (skip if already mounted)
- Create essential device nodes after switch_root (kmsg, console, etc.)
- Enable cgroup v2 controller delegation with init process isolation
- Mount BPF filesystem for cgroup v2 device control
- Add mknod fallback from sysfs in 20-persistent-mount.sh for /dev/vda
- Move KubeSolo binary to /usr/bin (avoid /usr/local bind mount hiding)
- Generate /etc/machine-id in 60-hostname.sh (kubelet requires it)
- Pre-initialize iptables tables before kube-proxy starts
- Add nft_reject, nft_fib, xt_nfacct to kernel modules list

Build system changes:
- New build-kernel.sh script for custom kernel compilation
- Dockerfile.builder adds kernel build deps (flex, bison, libelf, etc.)
- Selective kernel module install (only modules.list + transitive deps)
- Install iptables-nft (xtables-nft-multi) + shared libs in rootfs

Tested: ISO boots in QEMU, node reaches Ready in ~35s, CoreDNS and
local-path-provisioner pods start and run successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:13:31 -06:00
e372df578b feat: initial Phase 1 PoC scaffolding for KubeSolo OS
Complete Phase 1 implementation of KubeSolo OS — an immutable, bootable
Linux distribution built on Tiny Core Linux for running KubeSolo
single-node Kubernetes.

Build system:
- Makefile with fetch, rootfs, initramfs, iso, disk-image targets
- Dockerfile.builder for reproducible builds
- Scripts to download Tiny Core, extract rootfs, inject KubeSolo,
  pack initramfs, and create bootable ISO/disk images

Init system (10 POSIX sh stages):
- Early mount (proc/sys/dev/cgroup2), cmdline parsing, persistent
  mount with bind-mounts, kernel module loading, sysctl, DHCP
  networking, hostname, clock sync, containerd prep, KubeSolo exec

Shared libraries:
- functions.sh (device wait, IP lookup, config helpers)
- network.sh (static IP, config persistence, interface detection)
- health.sh (containerd, API server, node readiness checks)
- Emergency shell for boot failure debugging

Testing:
- QEMU boot test with serial log marker detection
- K8s readiness test with kubectl verification
- Persistence test (reboot + verify state survives)
- Workload deployment test (nginx pod)
- Local storage test (PVC + local-path provisioner)
- Network policy test
- Reusable run-vm.sh launcher

Developer tools:
- dev-vm.sh (interactive QEMU with port forwarding)
- rebuild-initramfs.sh (fast iteration)
- inject-ssh.sh (dropbear SSH for debugging)
- extract-kernel-config.sh + kernel-audit.sh

Documentation:
- Full design document with architecture research
- Boot flow documentation covering all 10 init stages
- Cloud-init examples (DHCP, static IP, Portainer Edge, air-gapped)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:18:42 -06:00