Commit Graph

8 Commits

Author SHA1 Message Date
ba4812f637 fix: complete ARM64 RPi build pipeline
Some checks failed
CI / Go Tests (push) Has been cancelled
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Has been cancelled
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Has been cancelled
CI / Shellcheck (push) Has been cancelled
Release / Test (push) Has been cancelled
Release / Build Binaries (amd64, linux, linux-amd64) (push) Has been cancelled
Release / Build Binaries (arm64, linux, linux-arm64) (push) Has been cancelled
Release / Build ISO (amd64) (push) Has been cancelled
Release / Create Release (push) Has been cancelled
- fetch-components.sh: download ARM64 KubeSolo binary (kubesolo-arm64)
- inject-kubesolo.sh: use arch-specific binaries for KubeSolo, cloud-init,
  and update agent; detect KVER from custom kernel when rootfs has none;
  cross-arch module resolution via find fallback when modprobe fails
- create-rpi-image.sh: kpartx support for Docker container builds
- Makefile: rootfs-arm64 depends on build-cross, includes pack-initramfs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 17:20:04 -06:00
efc7f80b65 feat: add security hardening, AppArmor, and ARM64 Raspberry Pi support (Phase 6)
Some checks failed
CI / Go Tests (push) Has been cancelled
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Has been cancelled
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Has been cancelled
CI / Shellcheck (push) Has been cancelled
Security hardening: bind kubeconfig server to localhost, mount hardening
(noexec/nosuid/nodev on tmpfs), sysctl network hardening, kernel module
loading lock after boot, SHA256 checksum verification for downloads,
kernel AppArmor + Audit support, complain-mode AppArmor profiles for
containerd and kubelet, and security integration test.

ARM64 Raspberry Pi support: piCore64 base extraction, RPi kernel build
from raspberrypi/linux fork, RPi firmware fetch, SD card image with 4-
partition GPT and tryboot A/B mechanism, BootEnv Go interface abstracting
GRUB vs RPi boot environments, architecture-aware build scripts, QEMU
aarch64 dev VM and boot test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 13:08:17 -06:00
39732488ef feat: custom kernel build + boot fixes for working container runtime
Build a custom Tiny Core 17.0 kernel (6.18.2) with missing configs
that the stock kernel lacks for container workloads:
- CONFIG_CGROUP_BPF=y (cgroup v2 device control via BPF)
- CONFIG_DEVTMPFS=y (auto-create /dev device nodes)
- CONFIG_DEVTMPFS_MOUNT=y (auto-mount devtmpfs)
- CONFIG_MEMCG=y (memory cgroup controller for memory.max)
- CONFIG_CFS_BANDWIDTH=y (CPU bandwidth throttling for cpu.max)

Also strips unnecessary subsystems (sound, GPU, wireless, Bluetooth,
KVM, etc.) for minimal footprint on a headless K8s edge appliance.

Init system fixes for successful boot-to-running-pods:
- Add switch_root in init.sh to escape initramfs (runc pivot_root)
- Add mountpoint guards in 00-early-mount.sh (skip if already mounted)
- Create essential device nodes after switch_root (kmsg, console, etc.)
- Enable cgroup v2 controller delegation with init process isolation
- Mount BPF filesystem for cgroup v2 device control
- Add mknod fallback from sysfs in 20-persistent-mount.sh for /dev/vda
- Move KubeSolo binary to /usr/bin (avoid /usr/local bind mount hiding)
- Generate /etc/machine-id in 60-hostname.sh (kubelet requires it)
- Pre-initialize iptables tables before kube-proxy starts
- Add nft_reject, nft_fib, xt_nfacct to kernel modules list

Build system changes:
- New build-kernel.sh script for custom kernel compilation
- Dockerfile.builder adds kernel build deps (flex, bison, libelf, etc.)
- Selective kernel module install (only modules.list + transitive deps)
- Install iptables-nft (xtables-nft-multi) + shared libs in rootfs

Tested: ISO boots in QEMU, node reaches Ready in ~35s, CoreDNS and
local-path-provisioner pods start and run successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 23:13:31 -06:00
456aa8eb5b feat: add distribution and fleet management — CI/CD, OCI, metrics, ARM64 (Phase 5)
Some checks failed
CI / Go Tests (push) Has been cancelled
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Has been cancelled
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Has been cancelled
CI / Shellcheck (push) Has been cancelled
- Gitea Actions CI pipeline: Go tests, build, shellcheck on push/PR
- Gitea Actions release pipeline: full build + artifact upload on version tags
- OCI container image builder for registry-based OS distribution
- Zero-dependency Prometheus metrics endpoint (kubesolo_os_info, boot,
  memory, update status) with 10 tests
- USB provisioning tool for air-gapped deployments with cloud-init injection
- ARM64 cross-compilation support (TARGET_ARCH env var + build-cross.sh)
- Updated build scripts to accept TARGET_ARCH for both amd64 and arm64
- New Makefile targets: oci-image, build-cross

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 11:36:53 -06:00
49a37e30e8 feat: add production hardening — Ed25519 signing, Portainer Edge, SSH extension (Phase 4)
Image signing:
- Ed25519 sign/verify package (pure Go stdlib, zero deps)
- genkey and sign CLI subcommands for build system
- Optional --pubkey flag for verifying updates on apply
- Signature URLs in update metadata (latest.json)

Portainer Edge Agent:
- cloud-init portainer.go module writes K8s manifest
- Auto-deploys Edge Agent when portainer.edge-agent.enabled
- Full RBAC (ServiceAccount, ClusterRoleBinding, Deployment)
- 5 Portainer tests in portainer_test.go

Production tooling:
- SSH debug extension builder (hack/build-ssh-extension.sh)
- Boot performance benchmark (test/benchmark/bench-boot.sh)
- Resource usage benchmark (test/benchmark/bench-resources.sh)
- Deployment guide (docs/deployment-guide.md)

Test results: 50 update agent tests + 22 cloud-init tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 11:26:23 -06:00
8d25e1890e feat: add A/B partition updates with GRUB and Go update agent (Phase 3)
Implement atomic OS updates via A/B partition scheme with automatic
rollback. GRUB bootloader manages slot selection with a 3-attempt
boot counter that auto-rolls back on repeated health check failures.

GRUB boot config:
- A/B slot selection with boot_counter/boot_success env vars
- Automatic rollback when counter reaches 0 (3 failed boots)
- Debug, emergency shell, and manual slot-switch menu entries

Disk image (refactored):
- 4-partition GPT layout: EFI + System A + System B + Data
- GRUB EFI/BIOS installation with graceful fallbacks
- Both system partitions populated during image creation

Update agent (Go, zero external deps):
- pkg/grubenv: read/write GRUB env vars (grub-editenv + manual fallback)
- pkg/partition: find/mount/write system partitions by label
- pkg/image: HTTP download with SHA256 verification
- pkg/health: post-boot checks (containerd, API server, node Ready)
- 6 CLI commands: check, apply, activate, rollback, healthcheck, status
- 37 unit tests across all 4 packages

Deployment:
- K8s CronJob for automatic update checks (every 6 hours)
- ConfigMap for update server URL
- Health check Job for post-boot verification

Build pipeline:
- build-update-agent.sh compiles static Linux binary (~5.9 MB)
- inject-kubesolo.sh includes update agent in initramfs
- Makefile: build-update-agent, test-update-agent, test-update targets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 11:12:46 -06:00
d900fa920e feat: add cloud-init Go parser (Phase 2)
Implement a lightweight cloud-init system for first-boot configuration:
- Go parser for YAML config (hostname, network, KubeSolo settings)
- Static/DHCP network modes with DNS override
- KubeSolo extra flags and API server SAN configuration
- Portainer Edge Agent and air-gapped deployment support
- New init stage 45-cloud-init.sh runs before network/hostname stages
- Stages 50/60 skip gracefully when cloud-init has already applied
- Build script compiles static Linux/amd64 binary (~2.7 MB)
- 17 unit tests covering parsing, validation, and example files
- Full documentation at docs/cloud-init.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:39:05 -06:00
e372df578b feat: initial Phase 1 PoC scaffolding for KubeSolo OS
Complete Phase 1 implementation of KubeSolo OS — an immutable, bootable
Linux distribution built on Tiny Core Linux for running KubeSolo
single-node Kubernetes.

Build system:
- Makefile with fetch, rootfs, initramfs, iso, disk-image targets
- Dockerfile.builder for reproducible builds
- Scripts to download Tiny Core, extract rootfs, inject KubeSolo,
  pack initramfs, and create bootable ISO/disk images

Init system (10 POSIX sh stages):
- Early mount (proc/sys/dev/cgroup2), cmdline parsing, persistent
  mount with bind-mounts, kernel module loading, sysctl, DHCP
  networking, hostname, clock sync, containerd prep, KubeSolo exec

Shared libraries:
- functions.sh (device wait, IP lookup, config helpers)
- network.sh (static IP, config persistence, interface detection)
- health.sh (containerd, API server, node readiness checks)
- Emergency shell for boot failure debugging

Testing:
- QEMU boot test with serial log marker detection
- K8s readiness test with kubectl verification
- Persistence test (reboot + verify state survives)
- Workload deployment test (nginx pod)
- Local storage test (PVC + local-path provisioner)
- Network policy test
- Reusable run-vm.sh launcher

Developer tools:
- dev-vm.sh (interactive QEMU with port forwarding)
- rebuild-initramfs.sh (fast iteration)
- inject-ssh.sh (dropbear SSH for debugging)
- extract-kernel-config.sh + kernel-audit.sh

Documentation:
- Full design document with architecture research
- Boot flow documentation covering all 10 init stages
- Cloud-init examples (DHCP, static IP, Portainer Edge, air-gapped)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:18:42 -06:00