Remove 'quiet' from RPi cmdline.txt so kernel probe messages are
visible on HDMI. Add comprehensive diagnostics to the data device
error path: dmesg for MMC/SDHCI/regulators/firmware, /sys/class/block
listing, and error message scanning. This will reveal why zero block
devices appear despite all kernel configs being correct.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sdhci-iproc driver (RPi 4 SD card controller) probes via Device
Tree matching. Using DTBs from the firmware repo instead of the
kernel build caused a mismatch — the driver silently failed to probe,
resulting in zero block devices after boot.
Changes:
- Use DTBs from custom-kernel-arm64/dtbs/ (matches the kernel)
- Firmware blobs (start4.elf, fixup4.dat) still from firmware repo
- Also includes prior fix for LABEL= resolution in persistent mount
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The cmdline uses kubesolo.data=LABEL=KSOLODATA, but the wait loop
in 20-persistent-mount.sh checked [ -b "LABEL=KSOLODATA" ] which
is always false — it's a label reference, not a block device path.
Fix by detecting LABEL= prefix and resolving it to a block device
path via blkid -L in the wait loop. Also loads mmc_block module as
fallback for platforms where it's not built-in.
Adds debug output listing available block devices on failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The autoboot.txt A/B redirect requires newer RPi EEPROM firmware.
On older EEPROMs, autoboot.txt is silently ignored and the firmware
tries to boot from partition 1 directly — failing with a rainbow
screen because partition 1 had no kernel or initramfs.
Changes:
- Increase partition 1 from 32 MB to 384 MB
- Populate partition 1 with full boot files (kernel, initramfs,
config.txt with kernel= directive, DTBs, overlays)
- Keep autoboot.txt for A/B redirect on supported EEPROMs
- When autoboot.txt works: boots from partition 2 (A/B scheme)
- When autoboot.txt is unsupported: boots from partition 1 (fallback)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Raspberry Pi firmware reads config.txt from partition 1 BEFORE
processing autoboot.txt. Without arm_64bit=1 on the boot control
partition, the firmware defaults to 32-bit mode and shows only a
rainbow square. Add minimal config.txt, device tree blobs, and
overlays to partition 1 so the firmware can initialize correctly
before redirecting to the A/B boot partitions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Switch from GPT to MBR (dos) partition table — GPT + autoboot.txt
fails on many Pi 4 EEPROM versions
- Copy firmware blobs (start*.elf, fixup*.dat) to partition 1 (KSOLOCTL)
so the EEPROM can find and load them
- Increase boot control partition from 16 MB to 32 MB to fit firmware
- Mark partition 1 as bootable
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- fetch-components.sh: download ARM64 KubeSolo binary (kubesolo-arm64)
- inject-kubesolo.sh: use arch-specific binaries for KubeSolo, cloud-init,
and update agent; detect KVER from custom kernel when rootfs has none;
cross-arch module resolution via find fallback when modprobe fails
- create-rpi-image.sh: kpartx support for Docker container builds
- Makefile: rootfs-arm64 depends on build-cross, includes pack-initramfs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add kpartx for reliable loop partition mapping in Docker containers
- Fix piCore64 download URL (changed from .img.gz to .zip format)
- Fix piCore64 boot partition mount (initramfs on p1, not p2)
- Fix tar --wildcards for RPi firmware extraction
- Add MIT license (same as KubeSolo)
- Add kpartx and unzip to Docker builder image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Includes cloud-init full flag support, security hardening, AppArmor,
and ARM64 Raspberry Pi support.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add missing flags (--local-storage-shared-path, --debug, --pprof-server,
--portainer-edge-id, --portainer-edge-key, --portainer-edge-async) so all
10 documented KubeSolo parameters can be configured via cloud-init YAML.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bind kubeconfig HTTP server to 0.0.0.0:8080 (was 127.0.0.1) so integration
tests can reach it via QEMU SLIRP port forwarding. Add shared wait_for_boot
and fetch_kubeconfig helpers to qemu-helpers.sh. Update all 5 integration
tests to fetch kubeconfig via HTTP and use it for kubectl authentication.
All 6 tests pass on Linux with KVM: boot (18s), security (7/7), K8s ready
(15s), workload deploy, local storage, network policy.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The stock TinyCore kernel config has "# CONFIG_SECURITY is not set" which
caused make olddefconfig to silently revert all security configs in a single
pass. Fix by applying security configs (AppArmor, Audit, LSM) after the
first olddefconfig resolves base dependencies, then running a second pass.
Added mandatory verification that exits on missing critical configs.
All QEMU test scripts converted from broken -cdrom + -append pattern to
direct kernel boot (-kernel + -initrd) via shared test/lib/qemu-helpers.sh
helper library. The -append flag only works with -kernel, not -cdrom.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Dockerfile.builder: Go 1.24.0 → 1.25.5 (go.mod requires it)
- test-boot.sh: use direct kernel boot via ISO extraction instead of
broken -cdrom + -append; fix boot marker to "KubeSolo is running"
(Stage 90 blocks on wait, never emits "complete")
- test-security-hardening.sh: same direct kernel boot and marker fixes
- run-vm.sh, dev-vm.sh, dev-vm-arm64.sh: quote QEMU -net args to
silence shellcheck SC2054
- fetch-components.sh, fetch-rpi-firmware.sh, dev-vm-arm64.sh: fix
trap quoting (SC2064)
Validated: full Docker build, 94 Go tests pass, QEMU boot (73s),
security hardening test (6/6 pass, 1 AppArmor skip pending kernel
rebuild).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Security hardening: bind kubeconfig server to localhost, mount hardening
(noexec/nosuid/nodev on tmpfs), sysctl network hardening, kernel module
loading lock after boot, SHA256 checksum verification for downloads,
kernel AppArmor + Audit support, complain-mode AppArmor profiles for
containerd and kubelet, and security integration test.
ARM64 Raspberry Pi support: piCore64 base extraction, RPi kernel build
from raspberrypi/linux fork, RPi firmware fetch, SD card image with 4-
partition GPT and tryboot A/B mechanism, BootEnv Go interface abstracting
GRUB vs RPi boot environments, architecture-aware build scripts, QEMU
aarch64 dev VM and boot test.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Try bsdtar first (macOS + Linux with libarchive-tools)
- Fall back to isoinfo (genisoimage/cdrtools)
- Fall back to loop mount (Linux only, requires root)
- Platform-aware error messages for e2fsprogs install
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- dev-vm.sh: rewrite for macOS (bsdtar ISO extraction, Homebrew mkfs.ext4
detection, direct kernel boot, TCG acceleration, port 8080 forwarding)
- inject-kubesolo.sh: add CA certificates bundle from builder so containerd
can verify TLS when pulling from registries (Docker Hub, etc.)
- 50-network.sh: add DNS fallback (10.0.2.3 + 8.8.8.8) when DHCP client
doesn't populate /etc/resolv.conf
- 90-kubesolo.sh: serve kubeconfig via HTTP on port 8080 for reliable
retrieval from host, add 127.0.0.1 and 10.0.2.15 to API server SANs
- portainer.go: add headless Service to Edge Agent manifest (required for
agent peer discovery DNS lookup)
- 10-parse-cmdline.sh + init.sh: add kubesolo.edge_id/edge_key boot params
- 20-persistent-mount.sh: auto-format unformatted data disks on first boot
- hack/fix-portainer-service.sh: helper to patch running cluster
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
README.md rewritten to reflect all 5 design-doc phases complete with
sections for custom kernel, cloud-init, atomic updates, monitoring,
full make targets table, and documentation links.
CHANGELOG.md created with detailed v0.1.0 release notes covering
all features across all phases.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Gitea Actions CI pipeline: Go tests, build, shellcheck on push/PR
- Gitea Actions release pipeline: full build + artifact upload on version tags
- OCI container image builder for registry-based OS distribution
- Zero-dependency Prometheus metrics endpoint (kubesolo_os_info, boot,
memory, update status) with 10 tests
- USB provisioning tool for air-gapped deployments with cloud-init injection
- ARM64 cross-compilation support (TARGET_ARCH env var + build-cross.sh)
- Updated build scripts to accept TARGET_ARCH for both amd64 and arm64
- New Makefile targets: oci-image, build-cross
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement atomic OS updates via A/B partition scheme with automatic
rollback. GRUB bootloader manages slot selection with a 3-attempt
boot counter that auto-rolls back on repeated health check failures.
GRUB boot config:
- A/B slot selection with boot_counter/boot_success env vars
- Automatic rollback when counter reaches 0 (3 failed boots)
- Debug, emergency shell, and manual slot-switch menu entries
Disk image (refactored):
- 4-partition GPT layout: EFI + System A + System B + Data
- GRUB EFI/BIOS installation with graceful fallbacks
- Both system partitions populated during image creation
Update agent (Go, zero external deps):
- pkg/grubenv: read/write GRUB env vars (grub-editenv + manual fallback)
- pkg/partition: find/mount/write system partitions by label
- pkg/image: HTTP download with SHA256 verification
- pkg/health: post-boot checks (containerd, API server, node Ready)
- 6 CLI commands: check, apply, activate, rollback, healthcheck, status
- 37 unit tests across all 4 packages
Deployment:
- K8s CronJob for automatic update checks (every 6 hours)
- ConfigMap for update server URL
- Health check Job for post-boot verification
Build pipeline:
- build-update-agent.sh compiles static Linux binary (~5.9 MB)
- inject-kubesolo.sh includes update agent in initramfs
- Makefile: build-update-agent, test-update-agent, test-update targets
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement a lightweight cloud-init system for first-boot configuration:
- Go parser for YAML config (hostname, network, KubeSolo settings)
- Static/DHCP network modes with DNS override
- KubeSolo extra flags and API server SAN configuration
- Portainer Edge Agent and air-gapped deployment support
- New init stage 45-cloud-init.sh runs before network/hostname stages
- Stages 50/60 skip gracefully when cloud-init has already applied
- Build script compiles static Linux/amd64 binary (~2.7 MB)
- 17 unit tests covering parsing, validation, and example files
- Full documentation at docs/cloud-init.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>