feat: add production hardening — Ed25519 signing, Portainer Edge, SSH extension (Phase 4)
Image signing:
- Ed25519 sign/verify package (pure Go stdlib, zero deps)
- genkey and sign CLI subcommands for build system
- Optional --pubkey flag for verifying updates on apply
- Signature URLs in update metadata (latest.json)

Portainer Edge Agent:
- cloud-init portainer.go module writes K8s manifest
- Auto-deploys Edge Agent when portainer.edge-agent.enabled
- Full RBAC (ServiceAccount, ClusterRoleBinding, Deployment)
- 5 Portainer tests in portainer_test.go

Production tooling:
- SSH debug extension builder (hack/build-ssh-extension.sh)
- Boot performance benchmark (test/benchmark/bench-boot.sh)
- Resource usage benchmark (test/benchmark/bench-resources.sh)
- Deployment guide (docs/deployment-guide.md)

Test results: 50 update agent tests + 22 cloud-init tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
450
docs/deployment-guide.md
Normal file
@@ -0,0 +1,450 @@
# KubeSolo OS — Deployment Guide

This guide covers deploying KubeSolo OS to physical hardware and virtual machines,
including first-boot configuration, update signing, and Portainer Edge integration.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Building](#building)
- [Installation Methods](#installation-methods)
- [First-Boot Configuration (Cloud-Init)](#first-boot-configuration)
- [Update Signing](#update-signing)
- [Portainer Edge Integration](#portainer-edge-integration)
- [SSH Debug Access](#ssh-debug-access)
- [Monitoring and Health Checks](#monitoring-and-health-checks)
- [Troubleshooting](#troubleshooting)

---

## Prerequisites

**Hardware requirements:**
- x86_64 processor
- 512 MB RAM minimum (1 GB recommended)
- 8 GB storage minimum (16 GB recommended)
- Network interface (wired or WiFi with supported chipset)

**Build requirements:**
- Linux or macOS host
- Docker (for reproducible builds) or: bash, make, cpio, gzip, xorriso, Go 1.22+
- QEMU (for testing)

---

## Building

### Quick build (ISO)

```bash
git clone https://github.com/portainer/kubesolo-os.git
cd kubesolo-os
make fetch    # Download Tiny Core + KubeSolo
make iso      # Build bootable ISO
```

Output: `output/kubesolo-os-<version>.iso`

### Disk image (for persistent installations)

```bash
make disk-image    # Build raw disk with A/B partitions
```

Output: `output/kubesolo-os-<version>.img`

### Reproducible build (Docker)

```bash
make docker-build
```

---

## Installation Methods

### USB Flash Drive

```bash
# Write disk image to USB (replace /dev/sdX with your device)
sudo dd if=output/kubesolo-os-0.1.0.img of=/dev/sdX bs=4M status=progress
sync
```

### Virtual Machine (QEMU/KVM)

```bash
# Quick launch for testing
make dev-vm

# Or manually:
qemu-system-x86_64 -m 1024 -smp 2 \
  -enable-kvm -cpu host \
  -drive file=output/kubesolo-os-0.1.0.img,format=raw,if=virtio \
  -net nic,model=virtio \
  -net user,hostfwd=tcp::6443-:6443,hostfwd=tcp::2222-:22 \
  -nographic
```

### Cloud / Hypervisor

Convert the raw image for your platform:

```bash
# VMware
qemu-img convert -f raw -O vmdk output/kubesolo-os-0.1.0.img kubesolo-os.vmdk

# VirtualBox
qemu-img convert -f raw -O vdi output/kubesolo-os-0.1.0.img kubesolo-os.vdi

# Hyper-V
qemu-img convert -f raw -O vhdx output/kubesolo-os-0.1.0.img kubesolo-os.vhdx
```

---

## First-Boot Configuration

KubeSolo OS uses a simplified cloud-init system for first-boot configuration.
Place the config file on the data partition before first boot.

### Config file location

```
/mnt/data/etc-kubesolo/cloud-init.yaml
```

For ISO boot, the config can be provided via a secondary drive or kernel parameter:
```
kubesolo.cloudinit=/path/to/cloud-init.yaml
```

### Basic DHCP configuration

```yaml
hostname: kubesolo-node-01

network:
  mode: dhcp

kubesolo:
  local-storage: true
```

### Static IP configuration

```yaml
hostname: kubesolo-prod-01

network:
  mode: static
  interface: eth0
  address: 192.168.1.100/24
  gateway: 192.168.1.1
  dns:
    - 8.8.8.8
    - 1.1.1.1

kubesolo:
  local-storage: true
  apiserver-extra-sans:
    - 192.168.1.100
    - kubesolo-prod-01.local
```

### Air-gapped deployment

```yaml
hostname: airgap-node

network:
  mode: static
  address: 10.0.0.50/24
  gateway: 10.0.0.1
  dns:
    - 10.0.0.1

kubesolo:
  local-storage: true
  extra-flags: "--disable=traefik --disable=servicelb"

airgap:
  import-images: true
  images-dir: /mnt/data/images
```

Pre-load container images by placing tar archives in `/mnt/data/images/`.
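
One way to produce those archives is to export images with `docker save` on an internet-connected machine. This is a sketch under assumptions, not a documented KubeSolo OS workflow: the helper names `archive_name` and `export_images` and the image references are hypothetical, and this guide does not confirm which archive layout the importer expects.

```shell
#!/bin/sh
# Sketch: export images on a connected machine for later copy to /mnt/data/images/.

# archive_name IMAGE: turn an image reference into a flat file name (hypothetical helper).
archive_name() {
    printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

# export_images OUT_DIR IMAGE...: pull each image and save it as a tar archive in OUT_DIR.
export_images() {
    out_dir=$1; shift
    mkdir -p "$out_dir"
    for image in "$@"; do
        docker pull "$image" &&
        docker save -o "$out_dir/$(archive_name "$image")" "$image" || return 1
    done
}

# Example (image names are placeholders):
# export_images ./images docker.io/library/nginx:1.27 portainer/agent:2.20.0
```

The resulting `.tar` files would then be copied to the data partition by whatever offline transfer method is available.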

---

## Update Signing

KubeSolo OS supports Ed25519 signature verification for update images.
This ensures only authorized images can be applied to your devices.

### Generate a signing key pair

```bash
# On your build machine (keep private key secure!)
cd update && go run . genkey
```

Output:
```
Public key (hex): <64-char hex string>
Private key (hex): <128-char hex string>

Save the public key to /etc/kubesolo/update-pubkey.hex on the device.
Keep the private key secure and offline - use it only for signing updates.
```

Save the private key to a secure location (e.g., `signing-key.hex`).
Save the public key to `update-pubkey.hex`.
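
As a quick sanity check before distributing keys, the lengths shown in the `genkey` output above can be verified with a few lines of shell. This is a sketch: `is_hex_of_len` is a hypothetical helper, and it assumes each key file contains only the hex string, optionally followed by a newline.

```shell
#!/bin/sh
# is_hex_of_len FILE LEN: succeed if FILE holds exactly LEN hex characters
# (whitespace such as a trailing newline is ignored).
is_hex_of_len() {
    key=$(tr -d '[:space:]' < "$1") || return 1
    [ "${#key}" -eq "$2" ] || return 1
    case $key in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    return 0
}

# Example:
# is_hex_of_len update-pubkey.hex 64 && echo "public key looks well-formed"
# is_hex_of_len signing-key.hex 128  && echo "private key looks well-formed"
```

A length check only catches truncated or mangled files; it does not prove that the two keys belong to the same pair.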

### Sign update images

```bash
# Sign the kernel and initramfs
cd update && go run . sign --key /path/to/signing-key.hex \
  ../output/vmlinuz ../output/kubesolo-os.gz
```

This writes a `.sig` file next to each image (`vmlinuz.sig` and `kubesolo-os.gz.sig`).

### Deploy the public key

Place the public key on the device's data partition:
```
/mnt/data/etc-kubesolo/update-pubkey.hex
```

Or embed it in the cloud-init config on the data partition.

### Update server layout

Your update server should serve:
```
/latest.json           # Update metadata
/vmlinuz               # Kernel
/vmlinuz.sig           # Kernel signature
/kubesolo-os.gz        # Initramfs
/kubesolo-os.gz.sig    # Initramfs signature
```

Example `latest.json`:
```json
{
  "version": "0.2.0",
  "vmlinuz_url": "https://updates.example.com/v0.2.0/vmlinuz",
  "vmlinuz_sha256": "<sha256-hex>",
  "vmlinuz_sig_url": "https://updates.example.com/v0.2.0/vmlinuz.sig",
  "initramfs_url": "https://updates.example.com/v0.2.0/kubesolo-os.gz",
  "initramfs_sha256": "<sha256-hex>",
  "initramfs_sig_url": "https://updates.example.com/v0.2.0/kubesolo-os.gz.sig",
  "release_notes": "Bug fixes and security updates",
  "release_date": "2025-01-15"
}
```
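
The checksum fields can be filled in by a small script rather than by hand. The following is a sketch, assuming GNU `sha256sum` is available and that the field names match the example above. `sha256_of` and `write_latest_json` are hypothetical helpers, and `release_notes` and `release_date` are omitted for brevity.

```shell
#!/bin/sh
# sha256_of FILE: print only the hex digest of FILE.
sha256_of() {
    sha256sum "$1" | cut -d' ' -f1
}

# write_latest_json VERSION BASE_URL KERNEL INITRAMFS: print update metadata to stdout.
# BASE_URL is the directory URL under which the images and .sig files are served.
write_latest_json() {
    version=$1 base=$2 kernel=$3 initramfs=$4
    kname=$(basename "$kernel")
    iname=$(basename "$initramfs")
    cat <<EOF
{
  "version": "$version",
  "vmlinuz_url": "$base/$kname",
  "vmlinuz_sha256": "$(sha256_of "$kernel")",
  "vmlinuz_sig_url": "$base/$kname.sig",
  "initramfs_url": "$base/$iname",
  "initramfs_sha256": "$(sha256_of "$initramfs")",
  "initramfs_sig_url": "$base/$iname.sig"
}
EOF
}

# Example:
# write_latest_json 0.2.0 https://updates.example.com/v0.2.0 \
#     output/vmlinuz output/kubesolo-os.gz > latest.json
```

Generating the metadata from the same files that were signed keeps the published digests from drifting out of step with the images.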

### Apply a signed update

```bash
kubesolo-update apply \
  --server https://updates.example.com \
  --pubkey /etc/kubesolo/update-pubkey.hex
```

---

## Portainer Edge Integration

KubeSolo OS can automatically deploy the Portainer Edge Agent for remote
management through Portainer Business Edition.

### Setup in Portainer

1. Log in to your Portainer Business instance
2. Go to **Environments** → **Add Environment** → **Edge Agent**
3. Select **Kubernetes** as the environment type
4. Copy the **Edge ID** and **Edge Key** values

### Cloud-init configuration

```yaml
hostname: edge-node-01

network:
  mode: dhcp

kubesolo:
  local-storage: true

portainer:
  edge-agent:
    enabled: true
    edge-id: "your-edge-id-from-portainer"
    edge-key: "your-edge-key-from-portainer"
    portainer-url: "https://portainer.yourcompany.com"
    # Optional: pin agent version
    # image: portainer/agent:2.20.0
```

### Manual deployment

If not using cloud-init, deploy the Edge Agent manually after boot:

```bash
# Create namespace
kubesolo kubectl create namespace portainer

# Apply the edge agent manifest (generated from template)
kubesolo kubectl apply -f /path/to/portainer-edge-agent.yaml
```

### Verify connection

```bash
kubesolo kubectl -n portainer get pods
# Should show portainer-agent pod in Running state
```

The node should appear in your Portainer dashboard within a few minutes.
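
Rather than re-running the command by hand, the pod check can be polled until it succeeds. A minimal sketch, assuming a POSIX shell on the node: `wait_until` and `agent_running` are hypothetical helpers, and the timeout and interval values are arbitrary.

```shell
#!/bin/sh
# wait_until TIMEOUT INTERVAL CMD...: re-run CMD until it succeeds,
# giving up once roughly TIMEOUT seconds of waiting have accumulated.
wait_until() {
    timeout=$1 interval=$2; shift 2
    elapsed=0
    until "$@"; do
        elapsed=$((elapsed + interval))
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
    done
}

# agent_running: succeed once some pod in the portainer namespace reports Running.
agent_running() {
    kubesolo kubectl -n portainer get pods --no-headers 2>/dev/null | grep -q Running
}

# Example: poll every 5 seconds for up to 5 minutes.
# wait_until 300 5 agent_running && echo "Edge Agent pod is running"
```

A Running pod only shows that the agent started; whether it has registered with Portainer still has to be confirmed in the dashboard.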

---

## SSH Debug Access

For development and debugging, you can add SSH access using the
optional ssh-debug extension.

### Build the SSH extension

```bash
./hack/build-ssh-extension.sh --pubkey ~/.ssh/id_ed25519.pub
```

### Load on a running system

```bash
# Copy to device
scp output/ssh-debug.tcz root@<device>:/mnt/data/extensions/

# Load on the device (no reboot required)
unsquashfs -f -d / /mnt/data/extensions/ssh-debug.tcz
/usr/lib/kubesolo-os/init.d/85-ssh.sh
```

### Quick inject for development

```bash
# Inject into rootfs before building ISO
./hack/inject-ssh.sh
make initramfs iso
```

> **Warning:** SSH access should NEVER be enabled in production. The debug
> extension uses key-based auth only and has no password, but it still
> expands the attack surface.

---

## Monitoring and Health Checks

### Automatic health checks

KubeSolo OS runs a post-boot health check that verifies:
- containerd is running
- Kubernetes API server responds
- Node reports Ready status

On success, the health check marks the boot as successful in GRUB,
preventing automatic rollback.

### Deploy the health check CronJob

```bash
kubesolo kubectl apply -f update/deploy/update-cronjob.yaml
```

This deploys:
- A CronJob that checks for updates every 6 hours
- A health check Job that runs at boot

### Manual health check

```bash
kubesolo-update healthcheck --timeout 120
```

### Status check

```bash
kubesolo-update status
```

Shows:
- Active/passive slot
- Current version
- Boot counter status
- GRUB environment variables

---

## Troubleshooting

### Boot hangs at kernel loading

- Verify the ISO/image is not corrupted: check SHA256 against published hash
- Try adding `kubesolo.debug` to kernel command line for verbose logging
- Try `kubesolo.shell` to drop to emergency shell
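
The checksum comparison in the first step can be scripted. This sketch assumes GNU `sha256sum`; `verify_sha256` is a hypothetical helper, and the expected digest must come from wherever the release hashes are published.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX: succeed only if FILE's SHA-256 digest matches.
# The expected digest is compared case-insensitively.
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ -n "$actual" ] || return 1
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)" >&2
        return 1
    fi
}

# Example (replace the placeholder with the published digest):
# verify_sha256 output/kubesolo-os-0.1.0.img "<sha256-hex>"
```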

### KubeSolo fails to start

```bash
# Check KubeSolo logs
cat /var/log/kubesolo.log

# Verify containerd is running
pidof containerd

# Check if required kernel modules are loaded
lsmod | grep -E "overlay|br_netfilter|veth"
```
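
The `grep` above prints whichever modules happen to match, so a missing one is easy to overlook. The sketch below reports each absent module by name. `missing_modules` is a hypothetical helper that reads `/proc/modules`-style input (module name in the first column). Modules compiled into the kernel do not appear there, so a reported module is not necessarily unavailable.

```shell
#!/bin/sh
# missing_modules LISTFILE MODULE...: print each MODULE not listed in LISTFILE.
# LISTFILE uses the /proc/modules layout: the module name is the first field.
# Returns non-zero if at least one module is missing.
missing_modules() {
    list=$1; shift
    status=0
    for mod in "$@"; do
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$list"; then
            echo "$mod"
            status=1
        fi
    done
    return "$status"
}

# Example:
# missing_modules /proc/modules overlay br_netfilter veth
```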

### Node not reaching Ready state

```bash
# Check node status
kubesolo kubectl get nodes -o wide

# Check system pods
kubesolo kubectl get pods -A

# Check the logs of a failing system pod
kubesolo kubectl logs -n kube-system <pod-name>
```

### Update fails with signature error

```bash
# Verify the public key matches the signing key
cat /etc/kubesolo/update-pubkey.hex

# Test signature verification manually
kubesolo-update check --server https://updates.example.com
```

### Rollback to previous version

```bash
# Force rollback to the other slot
kubesolo-update rollback --grubenv /boot/grub/grubenv

# Reboot to apply
reboot
```

### Emergency shell

Boot with the `kubesolo.shell` kernel parameter to drop to an emergency shell.
Separately, if boot fails after 3 attempts, GRUB automatically rolls back to
the last known good slot.