kubesolo-os/init/lib/90-kubesolo.sh
Adolfo Delorenzo 51c1f78aea
Some checks failed
ARM64 Build / Build generic ARM64 disk image (push) Failing after 5s
CI / Go Tests (push) Successful in 1m55s
CI / Shellcheck (push) Successful in 53s
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Failing after 1m0s
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Successful in 2m18s
fix(arm64): bundle nft binary + always show access banner
Two real v0.3.0 bugs that surface on first boot:

1. KubeSolo v1.1.4+ owns its pod-masquerade rules directly via
     nft add table ip kubesolo-masq
   instead of going through kube-proxy/CNI. Without the standalone nft
   CLI in PATH, KubeSolo FATALs at startup with:
     "nft": executable file not found in $PATH
   then the init exits and the kernel panics on PID 1 death.

   inject-kubesolo.sh now also copies /usr/sbin/nft and the shared
   libraries it needs that weren't already in the image (libnftables,
   libedit, libjansson, libgmp, libtinfo, libbsd, libmd). The
   iptables-nft block above already covered libmnl, libnftnl,
   libxtables, libc, and ld.

2. The host-access banner ("From your host machine, run: curl -s
   http://localhost:8080 ...") was gated on the kubeconfig appearing
   within 120s. When KubeSolo crashed early (bug 1 above) or simply took
   longer than the wait window, the user never saw the connection
   instructions.

   90-kubesolo.sh now:
     - writes the banner to /etc/motd so it shows on any later shell
        (SSH extension, emergency shell, console login)
     - prints the banner to console unconditionally, after the wait
       loop, regardless of whether the kubeconfig was found
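The nft copy in fix 1 can be sketched roughly as follows. inject-kubesolo.sh itself isn't shown here, so `copy_with_deps`, its ldd-based dependency resolution, and the paths are illustrative assumptions, not the actual implementation:

```shell
# Hypothetical sketch (not the real inject-kubesolo.sh): copy a binary
# plus the shared libraries ldd reports for it into a rootfs tree.
copy_with_deps() {
    bin="$1" rootfs="$2"
    install -D "$bin" "$rootfs$bin"
    # ldd lines look like: "libnftables.so.1 => /usr/lib/libnftables.so.1 (0x...)"
    ldd "$bin" 2>/dev/null | awk '$2 == "=>" && $3 ~ /^\// { print $3 }' |
        while read -r lib; do
            install -D "$lib" "$rootfs$lib"
        done
}
# Example (paths assumed): copy_with_deps /usr/sbin/nft "$ROOTFS"
```

A real script must also handle the dynamic loader, whose ldd output format varies between libcs — which is presumably why `ld` is called out above as already covered by the iptables-nft block.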

Both fixes are pure rootfs changes — no kernel rebuild required.
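Since the nft regression only shows up at first boot, a post-build check against the unpacked rootfs can catch it earlier. A hedged sketch, assuming an unpacked rootfs directory; `check_rootfs` and the file list are illustrative (the library soname in particular is a guess):

```shell
# Hypothetical post-build sanity check: verify the nft CLI (and, as an
# assumed example, its main library) made it into the rootfs tree.
check_rootfs() {
    rootfs="$1"
    missing=0
    for f in usr/sbin/nft usr/lib/libnftables.so.1; do
        if [ ! -e "$rootfs/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
# Example: check_rootfs build/rootfs && echo "rootfs check passed"
```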

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 07:16:12 -06:00

135 lines · 4.8 KiB · Bash · Executable File

#!/bin/sh
# 90-kubesolo.sh — Start KubeSolo (final init stage)
#
# Starts KubeSolo, waits for it to become ready, then prints the kubeconfig
# to the console so it can be copied for remote kubectl access.
KUBESOLO_BIN="/usr/bin/kubesolo"
if [ ! -x "$KUBESOLO_BIN" ]; then
    log_err "KubeSolo binary not found at $KUBESOLO_BIN"
    return 1
fi
# Build KubeSolo command line
KUBESOLO_ARGS="--path /var/lib/kubesolo --local-storage"
# Add SANs for remote access (127.0.0.1 for QEMU port forwarding, 10.0.2.15 for QEMU NAT)
EXTRA_SANS="127.0.0.1,10.0.2.15"
HOSTNAME="$(hostname)"
if [ -n "$HOSTNAME" ]; then
    EXTRA_SANS="$EXTRA_SANS,$HOSTNAME"
fi
KUBESOLO_ARGS="$KUBESOLO_ARGS --apiserver-extra-sans $EXTRA_SANS"
# Add any extra flags from boot parameters
if [ -n "$KUBESOLO_EXTRA_FLAGS" ]; then
    KUBESOLO_ARGS="$KUBESOLO_ARGS $KUBESOLO_EXTRA_FLAGS"
fi
# Add flags from persistent config file
if [ -f /etc/kubesolo/extra-flags ]; then
    KUBESOLO_ARGS="$KUBESOLO_ARGS $(cat /etc/kubesolo/extra-flags)"
fi
# Pre-initialize iptables filter table and base chains.
# KubeSolo's kube-proxy uses iptables-restore (nf_tables backend) which needs
# the filter table to exist. Without this, the first iptables-restore fails
# with "RULE_APPEND failed (No such file or directory)".
if command -v iptables >/dev/null 2>&1; then
    iptables -t filter -L -n >/dev/null 2>&1 || true
    iptables -t nat -L -n >/dev/null 2>&1 || true
    iptables -t mangle -L -n >/dev/null 2>&1 || true
    log "Pre-initialized iptables tables (filter, nat, mangle)"
fi
# Export Portainer Edge env vars if set (via boot params or cloud-init)
if [ -n "${KUBESOLO_PORTAINER_EDGE_ID:-}" ]; then
    export KUBESOLO_PORTAINER_EDGE_ID
    log "Portainer Edge ID configured"
fi
if [ -n "${KUBESOLO_PORTAINER_EDGE_KEY:-}" ]; then
    export KUBESOLO_PORTAINER_EDGE_KEY
    log "Portainer Edge Key configured"
fi
log "Starting KubeSolo: $KUBESOLO_BIN $KUBESOLO_ARGS"
KUBECONFIG_PATH="/var/lib/kubesolo/pki/admin/admin.kubeconfig"
# Start KubeSolo in background so we can wait for readiness and print kubeconfig
# shellcheck disable=SC2086
$KUBESOLO_BIN $KUBESOLO_ARGS &
KUBESOLO_PID=$!
# Wait for kubeconfig to appear (KubeSolo generates it during startup)
log "Waiting for KubeSolo to generate kubeconfig..."
WAIT=0
while [ ! -f "$KUBECONFIG_PATH" ] && [ "$WAIT" -lt 120 ]; do
    sleep 2
    WAIT=$((WAIT + 2))
    # Check KubeSolo is still running
    if ! kill -0 "$KUBESOLO_PID" 2>/dev/null; then
        log_err "KubeSolo exited unexpectedly"
        wait "$KUBESOLO_PID" 2>/dev/null || true
        return 1
    fi
done
# Render the access banner. Written to /etc/motd so it's visible to anyone
# who later shells in (SSH extension, emergency shell, console login), and
# printed unconditionally to console below so the user sees it even when
# KubeSolo hasn't yet finished generating the kubeconfig.
ACCESS_BANNER="$(cat <<'BANNER'
============================================================
KubeSolo OS — host access
From your host machine, run:
curl -s http://localhost:8080 > ~/.kube/kubesolo-config
kubectl --kubeconfig ~/.kube/kubesolo-config get nodes
Notes:
- port 8080 serves the kubeconfig (admin) over HTTP
- port 6443 serves the Kubernetes API (HTTPS)
- Both ports are forwarded under QEMU's `-net user,hostfwd=…` config
============================================================
BANNER
)"
printf '%s\n' "$ACCESS_BANNER" > /etc/motd 2>/dev/null || true
if [ -f "$KUBECONFIG_PATH" ]; then
    log_ok "KubeSolo is running (PID $KUBESOLO_PID)"
    # Rewrite server URL for external access and serve via HTTP.
    # Serial console truncates long base64 cert lines, so we serve
    # the kubeconfig over HTTP for reliable retrieval.
    EXTERNAL_KC="/tmp/kubeconfig-external.yaml"
    sed 's|server: https://.*:6443|server: https://localhost:6443|' "$KUBECONFIG_PATH" > "$EXTERNAL_KC"
    # Serve kubeconfig via HTTP on port 8080 for remote kubectl access.
    # Binds to 0.0.0.0 so it's reachable via QEMU port forwarding.
    # Security: the kubeconfig is only useful if you can also reach
    # port 6443 (API server). On edge devices, network isolation
    # provides the security boundary.
    (while true; do
        printf 'HTTP/1.1 200 OK\r\nContent-Type: text/yaml\r\nConnection: close\r\n\r\n' | cat - "$EXTERNAL_KC" | nc -l -p 8080 2>/dev/null
    done) &
    log_ok "Kubeconfig available via HTTP on port 8080"
else
    log_warn "Kubeconfig not found after ${WAIT}s — KubeSolo may still be starting"
    log_warn "Check manually: cat $KUBECONFIG_PATH"
fi
# Show the banner regardless of kubeconfig state: the HTTP server above only
# starts on success, but printing the instructions during the long first-boot
# wait is useful and harmless (user retries the curl until it 200s).
echo ""
printf '%s\n' "$ACCESS_BANNER"
echo ""
# Keep init alive — wait on KubeSolo process
wait "$KUBESOLO_PID"
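The `sed` server-URL rewrite in the script can be sanity-checked against sample data; the kubeconfig fragment below is made up for illustration, only the `server:` line format matters:

```shell
# Illustrative input mimicking the server: line KubeSolo writes.
cat > /tmp/sample-kubeconfig <<'EOF'
apiVersion: v1
clusters:
- cluster:
    server: https://10.0.2.15:6443
  name: kubesolo
EOF
# Same expression 90-kubesolo.sh applies before serving the file:
sed 's|server: https://.*:6443|server: https://localhost:6443|' /tmp/sample-kubeconfig
```

The in-VM address is replaced while indentation and the rest of the document are untouched, so kubectl on the host talks to the forwarded localhost port.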