Commit Graph

9 Commits

Author SHA1 Message Date
3b47e7af68 release: v0.3.0
Promote VERSION from 0.3.0-dev to 0.3.0. Finalise CHANGELOG entry with
phases 5-8 work (state machine + metrics, channels + maintenance windows,
OCI multi-arch distribution, pre-flight gates + deeper healthcheck +
auto-rollback). Refresh README quick-start to show both x86_64 and generic
ARM64 paths; update the roadmap status table to mark all v0.3 phases
complete and explicitly track the v0.3.1 follow-ups (OCI cosign,
LABEL=KSOLODATA on ARM64, real-hardware validation).

Add docs/release-notes-0.3.0.md as the operator-facing summary, including a
v0.2.x -> v0.3.0 migration section (non-breaking on live systems) and the
known-limitations list copied from CHANGELOG.

All tests green: cloud-init module, all 10 update-module packages,
shellcheck across init / build / test / hack scripts under the v0.3
severity policy.

Tagging is intentionally NOT done from this commit — that's a manual step
so the operator can decide when v0.3.0 is final. After tagging:

  git tag -a v0.3.0 -m "KubeSolo OS v0.3.0"
  git push origin v0.3.0

Pushing the tag triggers .gitea/workflows/build-arm64.yaml, which runs the
full ARM64 build on the Odroid runner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 19:13:09 -06:00
de10de0ef3 chore(arm64): clean up debug logging + document Phase 3 status
Remove [KSOLO-DBG] per-step echoes from init.sh. The /dev/console redirect
stays; it's load-bearing for early-boot visibility on QEMU virt.

Add docs/arm64-status.md capturing the end-of-Phase-3 state:
  - What works (full boot through 14 stages, KubeSolo + containerd start)
  - Known limitations of the dev setup (QEMU TCG perf, /dev/vda4 hardcode,
    busybox-static gaps)
  - What's needed to ship v0.3 ARM64 as production-ready

Real-hardware validation (Graviton, Ampere, or similar) is the next gating
step before we can call ARM64 generic done.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 16:19:16 -06:00
19b99cf101 docs: define generic ARM64 vs RPi build-track architecture
Phase 1 audit finding: existing ARM64 build code is mostly already generic.
Only build-kernel-arm64.sh and rpi-kernel-config.fragment are misnamed (the
former is RPi-only, the latter is actually arch-agnostic). The QEMU virt
harness, modules-arm64.list, extract-core arm64 branch, and inject-kubesolo
arm64 branch are all generic.

This document records the target two-track layout for v0.3.0:
- Generic ARM64: mainline kernel, UEFI, GRUB, virtio, GPT 4-part image
- Raspberry Pi: raspberrypi/linux fork, autoboot.txt, MBR 4-part image
- Shared: init, cloud-init, update agent, modules list, kernel-container fragment

Phases 2 and 3 will execute the migration (rename build-kernel-arm64.sh ->
build-kernel-rpi.sh, write a new mainline build-kernel-arm64.sh, etc.).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:02:29 -06:00
059ec7955f chore: housekeeping for v0.3 prep
- Pin KUBESOLO_VERSION in versions.env (was soft-defaulted in fetch-components.sh)
- Gitignore screenshots, macOS resource forks, and common image extensions
- Update README roadmap: x86_64 stable, ARM64 generic in progress (v0.3),
  ARM64 RPi paused pending hardware
- Add docs/ci-runners.md documenting the Odroid arm64-linux Gitea runner

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 09:44:01 -06:00
61bd28c692 feat: cloud-init supports all documented KubeSolo CLI flags
Add missing flags (--local-storage-shared-path, --debug, --pprof-server,
--portainer-edge-id, --portainer-edge-key, --portainer-edge-async) so all
10 documented KubeSolo parameters can be configured via cloud-init YAML.
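
As a rough illustration of the mapping described above, the sketch below turns a parsed config into a KubeSolo argument list. Only the flag names come from this commit; the struct, its field names, the string-versus-boolean split, and the `--flag=value` rendering are assumptions made for the example, not the actual cloud-init schema.

```go
package main

import (
	"fmt"
	"strings"
)

// KubeSoloFlags is a hypothetical subset of the cloud-init schema. Field names
// and value types are assumptions; only the flag names come from the commit.
type KubeSoloFlags struct {
	LocalStorageSharedPath string
	Debug                  bool
	PprofServer            bool
	PortainerEdgeID        string
	PortainerEdgeKey       string
	PortainerEdgeAsync     bool
}

// Args renders only the options that were set, in a fixed order.
func (f KubeSoloFlags) Args() []string {
	var args []string
	addString := func(name, value string) {
		if value != "" {
			args = append(args, name+"="+value)
		}
	}
	addBool := func(name string, on bool) {
		if on {
			args = append(args, name)
		}
	}
	addString("--local-storage-shared-path", f.LocalStorageSharedPath)
	addBool("--debug", f.Debug)
	addBool("--pprof-server", f.PprofServer)
	addString("--portainer-edge-id", f.PortainerEdgeID)
	addString("--portainer-edge-key", f.PortainerEdgeKey)
	addBool("--portainer-edge-async", f.PortainerEdgeAsync)
	return args
}

func main() {
	f := KubeSoloFlags{LocalStorageSharedPath: "/mnt/shared", Debug: true}
	fmt.Println(strings.Join(f.Args(), " "))
}
```

Emitting flags in a fixed order keeps the generated command line stable across boots, which makes it easier to compare in tests.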

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 15:49:31 -06:00
49a37e30e8 feat: add production hardening — Ed25519 signing, Portainer Edge, SSH extension (Phase 4)
Image signing:
- Ed25519 sign/verify package (pure Go stdlib, zero deps)
- genkey and sign CLI subcommands for build system
- Optional --pubkey flag for verifying updates on apply
- Signature URLs in update metadata (latest.json)

Portainer Edge Agent:
- cloud-init portainer.go module writes K8s manifest
- Auto-deploys Edge Agent when portainer.edge-agent.enabled
- Full RBAC (ServiceAccount, ClusterRoleBinding, Deployment)
- 5 Portainer tests in portainer_test.go

Production tooling:
- SSH debug extension builder (hack/build-ssh-extension.sh)
- Boot performance benchmark (test/benchmark/bench-boot.sh)
- Resource usage benchmark (test/benchmark/bench-resources.sh)
- Deployment guide (docs/deployment-guide.md)

Test results: 50 update agent tests + 22 cloud-init tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 11:26:23 -06:00
8d25e1890e feat: add A/B partition updates with GRUB and Go update agent (Phase 3)
Implement atomic OS updates via A/B partition scheme with automatic
rollback. GRUB bootloader manages slot selection with a 3-attempt
boot counter that auto-rolls back on repeated health check failures.

GRUB boot config:
- A/B slot selection with boot_counter/boot_success env vars
- Automatic rollback when counter reaches 0 (3 failed boots)
- Debug, emergency shell, and manual slot-switch menu entries
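
The counter behaviour above can be modelled in a few lines of Go. This is a simulation for reasoning about the scheme, not the GRUB configuration itself: the `ActiveSlot` field, the function names, and the choice to treat the fallback slot as known-good after a rollback are assumptions.

```go
package main

import "fmt"

// BootEnv models the GRUB environment variables named in the commit.
// ActiveSlot is a hypothetical field; the real variable name is not given.
type BootEnv struct {
	ActiveSlot  string // "A" or "B"
	BootCounter int    // attempts left before rollback
	BootSuccess bool   // set after a passing health check
}

const maxAttempts = 3

func otherSlot(s string) string {
	if s == "A" {
		return "B"
	}
	return "A"
}

// NextBoot returns the slot to boot and updates env as a bootloader might.
func NextBoot(env *BootEnv) string {
	if env.BootSuccess {
		return env.ActiveSlot // slot already confirmed healthy
	}
	if env.BootCounter <= 0 {
		// Attempts exhausted: fall back to the other slot (assumed known-good).
		env.ActiveSlot = otherSlot(env.ActiveSlot)
		env.BootSuccess = true
		env.BootCounter = maxAttempts
		return env.ActiveSlot
	}
	env.BootCounter--
	return env.ActiveSlot
}

// MarkHealthy is what a post-boot health check would call on success.
func MarkHealthy(env *BootEnv) {
	env.BootSuccess = true
	env.BootCounter = maxAttempts
}

func main() {
	// Freshly activated slot B that never passes its health check.
	env := &BootEnv{ActiveSlot: "B", BootCounter: maxAttempts}
	for i := 1; i <= 4; i++ {
		fmt.Printf("boot %d -> slot %s\n", i, NextBoot(env))
	}
}
```

In this model a new slot gets three unconfirmed boots, and the fourth boot lands back on the previous slot.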

Disk image (refactored):
- 4-partition GPT layout: EFI + System A + System B + Data
- GRUB EFI/BIOS installation with graceful fallbacks
- Both system partitions populated during image creation

Update agent (Go, zero external deps):
- pkg/grubenv: read/write GRUB env vars (grub-editenv + manual fallback)
- pkg/partition: find/mount/write system partitions by label
- pkg/image: HTTP download with SHA256 verification
- pkg/health: post-boot checks (containerd, API server, node Ready)
- 6 CLI commands: check, apply, activate, rollback, healthcheck, status
- 37 unit tests across all 4 packages

Deployment:
- K8s CronJob for automatic update checks (every 6 hours)
- ConfigMap for update server URL
- Health check Job for post-boot verification

Build pipeline:
- build-update-agent.sh compiles static Linux binary (~5.9 MB)
- inject-kubesolo.sh includes update agent in initramfs
- Makefile: build-update-agent, test-update-agent, test-update targets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 11:12:46 -06:00
d900fa920e feat: add cloud-init Go parser (Phase 2)
Implement a lightweight cloud-init system for first-boot configuration:
- Go parser for YAML config (hostname, network, KubeSolo settings)
- Static/DHCP network modes with DNS override
- KubeSolo extra flags and API server SAN configuration
- Portainer Edge Agent and air-gapped deployment support
- New init stage 45-cloud-init.sh runs before network/hostname stages
- Stages 50/60 skip gracefully when cloud-init has already applied
- Build script compiles static Linux/amd64 binary (~2.7 MB)
- 17 unit tests covering parsing, validation, and example files
- Full documentation at docs/cloud-init.md
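
To make the static/DHCP distinction concrete, here is a sketch of the kind of validation such a parser might run after decoding the YAML. The `NetworkConfig` struct and its field names are hypothetical; the actual schema is documented in docs/cloud-init.md and may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// NetworkConfig is a hypothetical shape for the parsed network section.
type NetworkConfig struct {
	Mode    string   // "dhcp" or "static"
	Address string   // CIDR notation, required for static
	Gateway string   // required for static
	DNS     []string // optional override in either mode
}

// Validate checks mode-specific requirements and any DNS override.
func (c NetworkConfig) Validate() error {
	switch c.Mode {
	case "dhcp":
		// Nothing mode-specific to check.
	case "static":
		if _, _, err := net.ParseCIDR(c.Address); err != nil {
			return fmt.Errorf("static address: %w", err)
		}
		if net.ParseIP(c.Gateway) == nil {
			return errors.New("static mode needs a valid gateway IP")
		}
	default:
		return fmt.Errorf("unknown network mode %q", c.Mode)
	}
	for _, s := range c.DNS {
		if net.ParseIP(s) == nil {
			return fmt.Errorf("invalid DNS server %q", s)
		}
	}
	return nil
}

func main() {
	cfg := NetworkConfig{
		Mode:    "static",
		Address: "192.168.1.10/24",
		Gateway: "192.168.1.1",
		DNS:     []string{"1.1.1.1"},
	}
	fmt.Println("validation error:", cfg.Validate())
}
```

Rejecting a bad config before any interface is touched lets the later network stage fall back to its default behaviour instead of leaving the node half-configured.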

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:39:05 -06:00
e372df578b feat: initial Phase 1 PoC scaffolding for KubeSolo OS
Complete Phase 1 implementation of KubeSolo OS — an immutable, bootable
Linux distribution built on Tiny Core Linux for running KubeSolo
single-node Kubernetes.

Build system:
- Makefile with fetch, rootfs, initramfs, iso, disk-image targets
- Dockerfile.builder for reproducible builds
- Scripts to download Tiny Core, extract rootfs, inject KubeSolo,
  pack initramfs, and create bootable ISO/disk images

Init system (10 POSIX sh stages):
- Early mount (proc/sys/dev/cgroup2), cmdline parsing, persistent
  mount with bind-mounts, kernel module loading, sysctl, DHCP
  networking, hostname, clock sync, containerd prep, KubeSolo exec

Shared libraries:
- functions.sh (device wait, IP lookup, config helpers)
- network.sh (static IP, config persistence, interface detection)
- health.sh (containerd, API server, node readiness checks)
- Emergency shell for boot failure debugging

Testing:
- QEMU boot test with serial log marker detection
- K8s readiness test with kubectl verification
- Persistence test (reboot + verify state survives)
- Workload deployment test (nginx pod)
- Local storage test (PVC + local-path provisioner)
- Network policy test
- Reusable run-vm.sh launcher

Developer tools:
- dev-vm.sh (interactive QEMU with port forwarding)
- rebuild-initramfs.sh (fast iteration)
- inject-ssh.sh (dropbear SSH for debugging)
- extract-kernel-config.sh + kernel-audit.sh

Documentation:
- Full design document with architecture research
- Boot flow documentation covering all 10 init stages
- Cloud-init examples (DHCP, static IP, Portainer Edge, air-gapped)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 10:18:42 -06:00