K3s stability improvements

Two Months of K3s Stability Improvements

TL;DR Over the past two months, I have made a series of stability improvements to my k3s homelab cluster. The biggest wins: migrating from AWS ECR to self-hosted Harbor (eliminating 12-hour token expiry), fixing recurring Grafana crashes caused by SQLite corruption on Longhorn, recovering pve4 after a failed LXC experiment, hardening NetworkPolicies to close gaps in pod-to-host traffic rules, and patching multiple CVEs across the media stack. The cluster now runs 7/7 nodes on k3s v1.34.4, all services monitored, all images pulled from Harbor with static credentials that never expire. ...

March 27, 2026 · 8 min · zolty
VPN tech collective mesh

Building a VPN Mesh for a Tech Collective

TL;DR I am designing a WireGuard VPN mesh to connect a small tech collective – a group of friends who each run their own infrastructure. The topology is hub-and-spoke with my k3s cluster as the hub, connecting 4+ remote sites over encrypted tunnels. Shared services include Jellyfin media federation, distributed CI/CD runners, LAN gaming, and centralized monitoring. The logging pipeline is privacy-first: all log filtering and anonymization happens at the edge (spoke side) before anything ships to the hub. This post covers the network design, the three-layer firewall architecture, the privacy model, and the phased rollout plan. ...

March 27, 2026 · 8 min · zolty
10GbE networking upgrade

10GbE Networking on a Budget: Mellanox ConnectX-3 and Bricked NICs

TL;DR I upgraded the inter-node networking from 1GbE to 10GbE using Mellanox ConnectX-3 NICs from eBay (~$15-20 each). Two of the three NICs worked immediately. The third needed a firmware flash from 2.33.5220 to 2.42.5000 using mstflint, which required two cold boots to recover from a partially bricked state. All three Proxmox hosts now have 10GbE connectivity via active-backup bonds. I also deployed Security Scanner infrastructure and fixed ARC runner Pod Security Standards. ...

February 20, 2026 · 6 min · zolty
VLAN migration

VLAN Migration: Moving a Live Kubernetes Cluster Without Downtime

TL;DR Today was the biggest infrastructure day yet. I migrated the entire k3s cluster from a flat network to a proper VLAN architecture: Server VLAN 20 for k3s nodes and services, Storage VLAN 30 for the NAS, and the existing default VLAN 1 for clients. This involved changing IPs on all VMs, updating MetalLB, reconfiguring Traefik, and recovering from an etcd quorum loss when I moved too many nodes at once. I also deployed the media stack (Jellyfin, Radarr, Sonarr, Prowlarr, Jellyseerr) and configured Intel iGPU passthrough infrastructure. ...

February 16, 2026 · 6 min · zolty

Affiliate Disclosure: Some links on this site are affiliate links (Amazon Associates, DigitalOcean referral). As an Amazon Associate, I earn from qualifying purchases. This does not affect the price you pay or my editorial independence — I only recommend products and services I personally use and trust.