Remove [KSOLO-DBG] per-step echos from init.sh. The /dev/console redirect
stays — it's load-bearing for early-boot visibility on QEMU virt.
Add docs/arm64-status.md capturing the end-of-Phase-3 state:
- What works (full boot through 14 stages, KubeSolo + containerd start)
- Known limitations of the dev setup (QEMU TCG perf, /dev/vda4 hardcode,
busybox-static gaps)
- What's needed to ship v0.3 ARM64 as production-ready
Real-hardware validation (Graviton, Ampere, or similar) is the next gating
step before we can call ARM64 generic done.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ubuntu's busybox-static 1.30.1 (which we use for the ARM64 rootfs after
piCore64's BusyBox crashes in QEMU virt) doesn't recognize POSIX character
classes. `tr -d '[:space:]'` is interpreted as "delete any of the literal
characters [, :, s, p, a, c, e, ]" — so every s/p/a/c/e in module names and
sysctl keys gets eaten.
Symptoms in the boot log:
virtio_net -> virtio_nt (e dropped)
overlay -> ovrly (e, a dropped)
bridge -> bridg (e dropped)
nf_conntrack -> nf_onntrk (c, a, c dropped)
net.bridge.bridge-nf-call-iptables -> nt.bridg.bridg-nf-ll-itbl
Fix: use explicit whitespace chars `tr -d ' \t\r\n'` in both
30-kernel-modules.sh and 40-sysctl.sh. Works under any tr implementation.
Also: filter functions.sh out of the init.d stage-copy loop. It's a shared
library (sourced by init.sh), not a numbered stage. With it in init.d the
main loop runs it as a stage after stage 90, then panics with "Init
completed without exec'ing KubeSolo".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ARM64 generic boot is failing with 'Segmentation fault' from a child
process before any visible init output. Adding per-step debug lines to
narrow down which mount/mkdir crashes.
To revert: git revert <this commit> before tagging v0.3.0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Instead of returning 1 (which triggers kernel panic via set -e before
emergency_shell runs), exec an interactive shell on /dev/console so
the user can run dmesg and debug interactively. Add initcall_debug
and loglevel=7 to cmdline.txt to show every driver probe during boot.
Also dump last 60 lines of dmesg before dropping to shell.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove 'quiet' from RPi cmdline.txt so kernel probe messages are
visible on HDMI. Add comprehensive diagnostics to the data device
error path: dmesg for MMC/SDHCI/regulators/firmware, /sys/class/block
listing, and error message scanning. This will reveal why zero block
devices appear despite all kernel configs being correct.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The cmdline uses kubesolo.data=LABEL=KSOLODATA, but the wait loop
in 20-persistent-mount.sh checked [ -b "LABEL=KSOLODATA" ] which
is always false — it's a label reference, not a block device path.
Fix by detecting LABEL= prefix and resolving it to a block device
path via blkid -L in the wait loop. Also loads mmc_block module as
fallback for platforms where it's not built-in.
Adds debug output listing available block devices on failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bind kubeconfig HTTP server to 0.0.0.0:8080 (was 127.0.0.1) so integration
tests can reach it via QEMU SLIRP port forwarding. Add shared wait_for_boot
and fetch_kubeconfig helpers to qemu-helpers.sh. Update all 5 integration
tests to fetch kubeconfig via HTTP and use it for kubectl authentication.
All 6 tests pass on Linux with KVM: boot (18s), security (7/7), K8s ready
(15s), workload deploy, local storage, network policy.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security hardening: bind kubeconfig server to localhost, mount hardening
(noexec/nosuid/nodev on tmpfs), sysctl network hardening, kernel module
loading lock after boot, SHA256 checksum verification for downloads,
kernel AppArmor + Audit support, complain-mode AppArmor profiles for
containerd and kubelet, and security integration test.
ARM64 Raspberry Pi support: piCore64 base extraction, RPi kernel build
from raspberrypi/linux fork, RPi firmware fetch, SD card image with 4-
partition GPT and tryboot A/B mechanism, BootEnv Go interface abstracting
GRUB vs RPi boot environments, architecture-aware build scripts, QEMU
aarch64 dev VM and boot test.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- dev-vm.sh: rewrite for macOS (bsdtar ISO extraction, Homebrew mkfs.ext4
detection, direct kernel boot, TCG acceleration, port 8080 forwarding)
- inject-kubesolo.sh: add CA certificates bundle from builder so containerd
can verify TLS when pulling from registries (Docker Hub, etc.)
- 50-network.sh: add DNS fallback (10.0.2.3 + 8.8.8.8) when DHCP client
doesn't populate /etc/resolv.conf
- 90-kubesolo.sh: serve kubeconfig via HTTP on port 8080 for reliable
retrieval from host, add 127.0.0.1 and 10.0.2.15 to API server SANs
- portainer.go: add headless Service to Edge Agent manifest (required for
agent peer discovery DNS lookup)
- 10-parse-cmdline.sh + init.sh: add kubesolo.edge_id/edge_key boot params
- 20-persistent-mount.sh: auto-format unformatted data disks on first boot
- hack/fix-portainer-service.sh: helper to patch running cluster
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement a lightweight cloud-init system for first-boot configuration:
- Go parser for YAML config (hostname, network, KubeSolo settings)
- Static/DHCP network modes with DNS override
- KubeSolo extra flags and API server SAN configuration
- Portainer Edge Agent and air-gapped deployment support
- New init stage 45-cloud-init.sh runs before network/hostname stages
- Stages 50/60 skip gracefully when cloud-init has already applied
- Build script compiles static Linux/amd64 binary (~2.7 MB)
- 17 unit tests covering parsing, validation, and example files
- Full documentation at docs/cloud-init.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>