feat: custom kernel build + boot fixes for working container runtime
Build a custom Tiny Core 17.0 kernel (6.18.2) with missing configs that
the stock kernel lacks for container workloads:
- CONFIG_CGROUP_BPF=y (cgroup v2 device control via BPF)
- CONFIG_DEVTMPFS=y (auto-create /dev device nodes)
- CONFIG_DEVTMPFS_MOUNT=y (auto-mount devtmpfs)
- CONFIG_MEMCG=y (memory cgroup controller for memory.max)
- CONFIG_CFS_BANDWIDTH=y (CPU bandwidth throttling for cpu.max)

Also strips unnecessary subsystems (sound, GPU, wireless, Bluetooth,
KVM, etc.) for minimal footprint on a headless K8s edge appliance.

Init system fixes for successful boot-to-running-pods:
- Add switch_root in init.sh to escape initramfs (runc pivot_root)
- Add mountpoint guards in 00-early-mount.sh (skip if already mounted)
- Create essential device nodes after switch_root (kmsg, console, etc.)
- Enable cgroup v2 controller delegation with init process isolation
- Mount BPF filesystem for cgroup v2 device control
- Add mknod fallback from sysfs in 20-persistent-mount.sh for /dev/vda
- Move KubeSolo binary to /usr/bin (avoid /usr/local bind mount hiding)
- Generate /etc/machine-id in 60-hostname.sh (kubelet requires it)
- Pre-initialize iptables tables before kube-proxy starts
- Add nft_reject, nft_fib, xt_nfacct to kernel modules list

Build system changes:
- New build-kernel.sh script for custom kernel compilation
- Dockerfile.builder adds kernel build deps (flex, bison, libelf, etc.)
- Selective kernel module install (only modules.list + transitive deps)
- Install iptables-nft (xtables-nft-multi) + shared libs in rootfs

Tested: ISO boots in QEMU, node reaches Ready in ~35s, CoreDNS and
local-path-provisioner pods start and run successfully.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
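The five kernel options named in the message can be checked against a kernel config file before building an image. The following is a minimal sketch, not part of this commit: the function name `missing_configs` and the variable `REQUIRED_CONFIGS` are hypothetical, and it assumes a plain-text config in the usual `CONFIG_FOO=y` format (for example the `.config` in a kernel build tree).

```shell
#!/bin/sh
# Hypothetical helper (not part of this commit): list which of the
# container-related options from the commit message are not built in.
REQUIRED_CONFIGS="CONFIG_CGROUP_BPF CONFIG_DEVTMPFS CONFIG_DEVTMPFS_MOUNT CONFIG_MEMCG CONFIG_CFS_BANDWIDTH"

# missing_configs FILE: print each required option that is not "=y" in FILE.
# Options set to "=m" or "is not set" are both reported as missing.
missing_configs() {
    config_file="$1"
    for opt in $REQUIRED_CONFIGS; do
        grep -q "^${opt}=y\$" "$config_file" || echo "$opt"
    done
}

# Example (path is an assumption): missing_configs /path/to/linux/.config
```

An empty result means all five options are built in; anything printed would need to be enabled before the resulting kernel can be expected to support the cgroup v2 features listed above.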
@@ -8,11 +8,26 @@ if [ "$KUBESOLO_NOPERSIST" = "1" ]; then
    return 0
fi

# Load block device drivers before waiting (modules loaded later in stage 30,
# but we need virtio_blk available NOW for /dev/vda detection)
modprobe virtio_blk 2>/dev/null || true
# Trigger mdev to create device nodes after loading driver
mdev -s 2>/dev/null || true

# Fallback: create device node from sysfs if devtmpfs/mdev didn't
DEV_NAME="${KUBESOLO_DATA_DEV##*/}"
if [ ! -b "$KUBESOLO_DATA_DEV" ] && [ -f "/sys/class/block/$DEV_NAME/dev" ]; then
    MAJMIN=$(cat "/sys/class/block/$DEV_NAME/dev")
    mknod "$KUBESOLO_DATA_DEV" b "${MAJMIN%%:*}" "${MAJMIN##*:}" 2>/dev/null || true
    log "Created $KUBESOLO_DATA_DEV via mknod ($MAJMIN)"
fi

# Wait for device to appear (USB, slow disks, virtio)
log "Waiting for data device: $KUBESOLO_DATA_DEV"
WAIT_SECS=30
for i in $(seq 1 "$WAIT_SECS"); do
    [ -b "$KUBESOLO_DATA_DEV" ] && break
    mdev -s 2>/dev/null || true
    sleep 1
done
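In the sysfs fallback of this hunk, the `mknod` failure is suppressed with `|| true`, so the "Created ... via mknod" message is logged even when no node was created. A slightly more defensive variant might look like the sketch below. It is not the commit's code: the function names `majmin_from_sysfs` and `ensure_block_node` and the `SYSFS_ROOT` override are hypothetical, and `echo` stands in for the script's own `log` helper.

```shell
#!/bin/sh
# Sketch only: function names and the SYSFS_ROOT override are hypothetical.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# majmin_from_sysfs DEV: print "MAJOR MINOR" for a block device path such as
# /dev/vda, or return 1 if sysfs has no entry for that device name.
majmin_from_sysfs() {
    dev_name="${1##*/}"
    dev_file="$SYSFS_ROOT/class/block/$dev_name/dev"
    [ -f "$dev_file" ] || return 1
    # The sysfs "dev" attribute holds a single "MAJOR:MINOR" line.
    IFS=: read -r major minor < "$dev_file" || [ -n "$major" ] || return 1
    echo "$major $minor"
}

# ensure_block_node DEV: create DEV with mknod when it is missing, and
# report success only if the node really exists afterwards.
ensure_block_node() {
    [ -b "$1" ] && return 0
    nums=$(majmin_from_sysfs "$1") || return 1
    # $nums is intentionally unquoted so it splits into MAJOR and MINOR.
    mknod "$1" b $nums 2>/dev/null && [ -b "$1" ] &&
        echo "Created $1 via mknod ($nums)"
}

# Example (needs root): ensure_block_node "$KUBESOLO_DATA_DEV" || true
```

Keeping the sysfs lookup separate from the `mknod` call makes the parsing testable without root, and the caller can still ignore a failure with `|| true` and rely on the wait loop.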