Commit Graph

8 Commits

Author SHA1 Message Date
7e46f8fdc2 fix(kernel): enable nftables address-family handlers
Some checks failed
ARM64 Build / Build generic ARM64 disk image (push) Failing after 6s
CI / Go Tests (push) Successful in 2m40s
CI / Shellcheck (push) Successful in 1m39s
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Failing after 10s
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Failing after 7s
Third KubeSolo crash from the QEMU validation loop:

  nft add table ip kubesolo-masq: exit status 1
    Error: Could not process rule: Operation not supported

That's EOPNOTSUPP from netlink. nf_tables core is loaded (the binary
even runs cleanly now after the previous dual-glibc fix), but no address
families are registered with it — so any `nft add table ip ...`,
`add table inet ...`, etc. is rejected.

In modern Linux (5.x / 6.x) the nftables address families are gated by
separate BOOL Kconfigs:

  CONFIG_NF_TABLES_IPV4    "ip" family
  CONFIG_NF_TABLES_IPV6    "ip6" family
  CONFIG_NF_TABLES_INET    "inet" family (both)
  CONFIG_NF_TABLES_NETDEV  "netdev" family

These are bool (not tristate) — they must be built into the kernel; no
module to load at runtime. Our shared kernel-container.fragment had
CONFIG_NF_TABLES=m (the core) but none of the family Kconfigs, and the
arm64 defconfig leaves them off.

Fix: enable all four families as =y in kernel-container.fragment.
Also pin the NFT expression modules KubeSolo v1.1.4+'s masquerade
ruleset depends on (NFT_NAT, NFT_MASQ, NFT_CT, NFT_REDIR, NFT_REJECT,
NFT_REJECT_INET, NFT_COMPAT, NFT_FIB + FIB_IPV4/6) as =m — they're
already in modules-arm64.list / modules.list and get modprobed at boot,
this just makes sure olddefconfig doesn't strip them when applied on
top of a minimal defconfig.

NF_NAT_MASQUERADE pinned =y because NFT_MASQ select-depends on it; on
some kernels it would get auto-selected, on others it gets dropped by
olddefconfig if not pinned.

This change requires a kernel rebuild — the configs are bool / module
defs, not runtime knobs. On the Odroid:

  rm -rf build/cache/kernel-arm64-generic
  sudo make kernel-arm64       # ~30-60 min from scratch
  sudo make rootfs-arm64 disk-image-arm64

x86 needs the same treatment when we cut v0.3.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 08:55:41 -06:00
1b44c9d621 feat: bump KubeSolo to v1.1.5 + cross-arch CI workflow
Some checks failed
ARM64 Build / Build generic ARM64 disk image (push) Failing after 3s
CI / Go Tests (push) Successful in 1m27s
CI / Shellcheck (push) Failing after 50s
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Failing after 1m33s
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Failing after 1m15s
Phase 4 of v0.3 — KubeSolo version bump and CI gating.

KubeSolo v1.1.0 → v1.1.5 brings:
- New flag --disable-ipv6 (v1.1.5)
- New flag --db-wal-repair (v1.1.5) — important for power-loss resilience
  on edge appliances; surfaced as kubesolo.db-wal-repair in cloud-init
- New flag --full (v1.1.4) — disables edge-optimised k8s overrides
- Pod egress connectivity fix after reboot (v1.1.4)
- Registry config persistence fix (v1.1.5)
- k8s 1.34.7, CoreDNS 1.14.3, Go 1.26.2

All three new flags wired into cloud-init: config.go fields, kubesolo.go
extra-flag emission, full-config.yaml example.

Supply-chain hygiene:
- Per-arch checksums: KUBESOLO_SHA256_AMD64 and KUBESOLO_SHA256_ARM64 in
  versions.env. Replaces the single shared KUBESOLO_SHA256 that couldn't
  meaningfully verify both binaries at once.
- Checksum now applied to the tarball (the immutable upstream artifact)
  rather than the post-extract binary.

CI:
- New .gitea/workflows/build-arm64.yaml routes the full kernel + rootfs +
  disk-image build to the Odroid arm64-linux runner. Triggers on push to
  main, tags, and manual workflow_dispatch. The boot smoke test is
  continue-on-error because KubeSolo's first-boot image import deadline
  fires under QEMU TCG on the Odroid.

VERSION bumped to 0.3.0-dev. CHANGELOG entry under [0.3.0-dev] captures all
Phase 1-4 work + the known limitations documented in arm64-status.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 16:26:20 -06:00
d51618badb build: separate generic ARM64 from Raspberry Pi kernel builds
Splits the ARM64 build into two tracks per docs/arm64-architecture.md:

Generic ARM64 (mainline kernel.org, UEFI, virtio, GRUB):
- New build/scripts/build-kernel-arm64.sh builds mainline LTS (6.12.x by default)
  from arm64 defconfig + shared container fragment + arm64-virt enables
  (VIRTIO_*, EFI_STUB, NVMe). Output: build/cache/kernel-arm64-generic/.
- New Makefile targets: kernel-arm64, rootfs-arm64 (now consumes the mainline
  kernel modules via TARGET_VARIANT=generic).
- versions.env: pin MAINLINE_KERNEL_VERSION=6.12.10, declare cdn.kernel.org URL
  and SHA256 placeholder.

Raspberry Pi (raspberrypi/linux fork, custom DTBs, autoboot.txt):
- build-kernel-arm64.sh (RPi-flavoured) renamed to build-kernel-rpi.sh; cache
  dir renamed from custom-kernel-arm64 to custom-kernel-rpi.
- New Makefile targets: kernel-rpi, rootfs-arm64-rpi (uses TARGET_VARIANT=rpi).
- rpi-image now depends on rootfs-arm64-rpi + kernel-rpi instead of the generic
  rootfs-arm64.
- create-rpi-image.sh + inject-kubesolo.sh updated to reference the new cache
  path. inject-kubesolo.sh now takes a TARGET_VARIANT env var (rpi|generic) to
  select which ARM64 kernel modules to consume.

Shared substrate:
- rpi-kernel-config.fragment renamed to kernel-container.fragment. The contents
  were never RPi-specific (cgroup, namespaces, AppArmor, netfilter) — just
  misnamed. Extended with extra subsystem disables (KVM, WLAN, CFG80211,
  INFINIBAND, PCMCIA, HAMRADIO, ISDN, ATM, INPUT_JOYSTICK, INPUT_TABLET, FPGA)
  and CONFIG_LSM=lockdown,yama,apparmor.
- build-kernel.sh (x86) refactored to apply the shared fragment via a generic
  apply_fragment function (two-pass for the TC stock config security dance),
  killing ~50 lines of inline config duplication.

Note: rename detection shows build-kernel-arm64.sh as 'modified' because the
new file at that path is the mainline build, while the old RPi-flavoured
content lives in build-kernel-rpi.sh (which appears as a new file). The git
log for build-kernel-rpi.sh is empty; the RPi history is preserved at the
original path until this commit.

No actual kernel build runs in this commit — that's Phase 3 work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:30:11 -06:00
059ec7955f chore: housekeeping for v0.3 prep
- Pin KUBESOLO_VERSION in versions.env (was soft-defaulted in fetch-components.sh)
- Gitignore screenshots, macOS resource forks, and common image extensions
- Update README roadmap: x86_64 stable, ARM64 generic in progress (v0.3),
  ARM64 RPi paused pending hardware
- Add docs/ci-runners.md documenting the Odroid arm64-linux Gitea runner

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:44:01 -06:00
09dcea84ef fix: disk image build, piCore64 URL, license
Some checks failed
CI / Go Tests (push) Has been cancelled
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Has been cancelled
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Has been cancelled
CI / Shellcheck (push) Has been cancelled
Release / Test (push) Has been cancelled
Release / Build Binaries (amd64, linux, linux-amd64) (push) Has been cancelled
Release / Build Binaries (arm64, linux, linux-arm64) (push) Has been cancelled
Release / Build ISO (amd64) (push) Has been cancelled
Release / Create Release (push) Has been cancelled
- Add kpartx for reliable loop partition mapping in Docker containers
- Fix piCore64 download URL (changed from .img.gz to .zip format)
- Fix piCore64 boot partition mount (initramfs on p1, not p2)
- Fix tar --wildcards for RPi firmware extraction
- Add MIT license (same as KubeSolo)
- Add kpartx and unzip to Docker builder image

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 17:05:03 -06:00
efc7f80b65 feat: add security hardening, AppArmor, and ARM64 Raspberry Pi support (Phase 6)
Some checks failed
CI / Go Tests (push) Has been cancelled
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Has been cancelled
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Has been cancelled
CI / Shellcheck (push) Has been cancelled
Security hardening: bind kubeconfig server to localhost, mount hardening
(noexec/nosuid/nodev on tmpfs), sysctl network hardening, kernel module
loading lock after boot, SHA256 checksum verification for downloads,
kernel AppArmor + Audit support, complain-mode AppArmor profiles for
containerd and kubelet, and security integration test.

ARM64 Raspberry Pi support: piCore64 base extraction, RPi kernel build
from raspberrypi/linux fork, RPi firmware fetch, SD card image with 4-
partition GPT and tryboot A/B mechanism, BootEnv Go interface abstracting
GRUB vs RPi boot environments, architecture-aware build scripts, QEMU
aarch64 dev VM and boot test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 13:08:17 -06:00
39732488ef feat: custom kernel build + boot fixes for working container runtime
Build a custom Tiny Core 17.0 kernel (6.18.2) with missing configs
that the stock kernel lacks for container workloads:
- CONFIG_CGROUP_BPF=y (cgroup v2 device control via BPF)
- CONFIG_DEVTMPFS=y (auto-create /dev device nodes)
- CONFIG_DEVTMPFS_MOUNT=y (auto-mount devtmpfs)
- CONFIG_MEMCG=y (memory cgroup controller for memory.max)
- CONFIG_CFS_BANDWIDTH=y (CPU bandwidth throttling for cpu.max)

Also strips unnecessary subsystems (sound, GPU, wireless, Bluetooth,
KVM, etc.) for minimal footprint on a headless K8s edge appliance.

Init system fixes for successful boot-to-running-pods:
- Add switch_root in init.sh to escape initramfs (runc pivot_root)
- Add mountpoint guards in 00-early-mount.sh (skip if already mounted)
- Create essential device nodes after switch_root (kmsg, console, etc.)
- Enable cgroup v2 controller delegation with init process isolation
- Mount BPF filesystem for cgroup v2 device control
- Add mknod fallback from sysfs in 20-persistent-mount.sh for /dev/vda
- Move KubeSolo binary to /usr/bin (avoid /usr/local bind mount hiding)
- Generate /etc/machine-id in 60-hostname.sh (kubelet requires it)
- Pre-initialize iptables tables before kube-proxy starts
- Add nft_reject, nft_fib, xt_nfacct to kernel modules list

Build system changes:
- New build-kernel.sh script for custom kernel compilation
- Dockerfile.builder adds kernel build deps (flex, bison, libelf, etc.)
- Selective kernel module install (only modules.list + transitive deps)
- Install iptables-nft (xtables-nft-multi) + shared libs in rootfs

Tested: ISO boots in QEMU, node reaches Ready in ~35s, CoreDNS and
local-path-provisioner pods start and run successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:13:31 -06:00
e372df578b feat: initial Phase 1 PoC scaffolding for KubeSolo OS
Complete Phase 1 implementation of KubeSolo OS — an immutable, bootable
Linux distribution built on Tiny Core Linux for running KubeSolo
single-node Kubernetes.

Build system:
- Makefile with fetch, rootfs, initramfs, iso, disk-image targets
- Dockerfile.builder for reproducible builds
- Scripts to download Tiny Core, extract rootfs, inject KubeSolo,
  pack initramfs, and create bootable ISO/disk images

Init system (10 POSIX sh stages):
- Early mount (proc/sys/dev/cgroup2), cmdline parsing, persistent
  mount with bind-mounts, kernel module loading, sysctl, DHCP
  networking, hostname, clock sync, containerd prep, KubeSolo exec

Shared libraries:
- functions.sh (device wait, IP lookup, config helpers)
- network.sh (static IP, config persistence, interface detection)
- health.sh (containerd, API server, node readiness checks)
- Emergency shell for boot failure debugging

Testing:
- QEMU boot test with serial log marker detection
- K8s readiness test with kubectl verification
- Persistence test (reboot + verify state survives)
- Workload deployment test (nginx pod)
- Local storage test (PVC + local-path provisioner)
- Network policy test
- Reusable run-vm.sh launcher

Developer tools:
- dev-vm.sh (interactive QEMU with port forwarding)
- rebuild-initramfs.sh (fast iteration)
- inject-ssh.sh (dropbear SSH for debugging)
- extract-kernel-config.sh + kernel-audit.sh

Documentation:
- Full design document with architecture research
- Boot flow documentation covering all 10 init stages
- Cloud-init examples (DHCP, static IP, Portainer Edge, air-gapped)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:18:42 -06:00