
Two Months of K3s Stability Improvements

TL;DR Over the past two months, I have made a series of stability improvements to my k3s homelab cluster. The biggest wins: migrating from AWS ECR to self-hosted Harbor (eliminating 12-hour token expiry), fixing recurring Grafana crashes caused by SQLite corruption on Longhorn, recovering pve4 after a failed LXC experiment, hardening NetworkPolicies to close gaps in pod-to-host traffic rules, and patching multiple CVEs across the media stack. The cluster now runs 7/7 nodes on k3s v1.34.4, all services monitored, all images pulled from Harbor with static credentials that never expire. ...
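The NetworkPolicy hardening mentioned above might look something like this minimal sketch — a default-deny policy with an explicit DNS carve-out. The namespace and label names here are illustrative assumptions, not taken from the actual cluster:

```yaml
# Hypothetical default-deny policy with a DNS egress allowance;
# namespace and labels are placeholders, not the cluster's real ones.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-with-dns
  namespace: media
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
  egress:
    - to:                  # allow DNS lookups to kube-dns only
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
```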

March 27, 2026 · 8 min · zolty

Building a VPN Mesh for a Tech Collective

TL;DR I am designing a WireGuard VPN mesh to connect a small tech collective – a group of friends who each run their own infrastructure. The topology is hub-and-spoke with my k3s cluster as the hub, connecting 4+ remote sites over encrypted tunnels. Shared services include Jellyfin media federation, distributed CI/CD runners, LAN gaming, and centralized monitoring. The logging pipeline is privacy-first: all log filtering and anonymization happens at the edge (spoke side) before anything ships to the hub. This post covers the network design, the three-layer firewall architecture, the privacy model, and the phased rollout plan. ...
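A hub-and-spoke WireGuard topology like the one described boils down to one `[Peer]` block per spoke on the hub, with `AllowedIPs` pinning each spoke to its overlay address plus any site subnet it advertises. Keys, addresses, and the overlay subnet below are placeholder assumptions:

```ini
# Hypothetical hub-side wg0.conf; keys, IPs, and the 10.100.0.0/24
# overlay subnet are illustrative, not the real mesh addressing.
[Interface]
Address = 10.100.0.1/24
ListenPort = 51820
PrivateKey = <hub-private-key>

[Peer]
# Spoke 1: one /32 overlay address plus its advertised LAN subnet
PublicKey = <spoke1-public-key>
AllowedIPs = 10.100.0.2/32, 192.168.10.0/24
PersistentKeepalive = 25
```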

March 27, 2026 · 8 min · zolty

Planning Authentik: Centralized Identity for a Homelab

TL;DR I am deploying Authentik as a centralized identity provider for my k3s cluster. It replaces the current OAuth2 Proxy setup with proper SSO, federates Google as a social login source, and introduces group-based RBAC (admins, writers, readers) across all services. The migration is phased – public services first via Traefik forwardAuth, then internal services via native OIDC, then proxy-protected apps that have no OIDC support. OAuth2 Proxy stays in git for instant rollback. This post covers the architecture, the user model, the edge security design, and the gotchas I expect to hit. ...
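The first migration phase — public services behind Traefik forwardAuth — would hinge on a Middleware resource along these lines. The service name, namespace, and port are assumptions; the outpost path matches Authentik's documented Traefik integration:

```yaml
# Sketch of a Traefik forwardAuth middleware pointing at the Authentik
# embedded outpost; service address is an illustrative placeholder.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: authentik-forward-auth
  namespace: auth
spec:
  forwardAuth:
    address: http://authentik-server.auth.svc.cluster.local:9000/outpost.goauthentik.io/auth/traefik
    trustForwardHeader: true
    authResponseHeaders:
      - X-authentik-username
      - X-authentik-groups
```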

March 27, 2026 · 7 min · zolty

This Blog Deploys Itself: Self-Hosted CI/CD on k3s with GitHub ARC

TL;DR The blog is deployed by GitHub Actions runners running inside the same k3s cluster it’s talking about. A push to main with content under hugo/ triggers a build, a two-pass S3 sync, and a CloudFront invalidation. A daily 06:00 UTC cron handles future-dated posts so I can commit a backlog and let them drip out on schedule. After every successful deploy, a Playwright job kicks off and scans the live site for broken links, visual regressions, and security header compliance. The whole thing runs on eight self-hosted amd64 runners managed by GitHub’s Actions Runner Controller (ARC) in the cluster. Not a single managed CI minute gets billed. ...
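The trigger side of a workflow like that could be sketched as follows — the 06:00 UTC cron and `hugo/` path filter come from the description above, while the job layout and labels are assumptions:

```yaml
# Illustrative trigger section; job name and runner labels are assumed.
on:
  push:
    branches: [main]
    paths:
      - "hugo/**"
  schedule:
    - cron: "0 6 * * *"   # daily publish pass for future-dated posts

jobs:
  build-deploy:
    runs-on: [self-hosted, linux, x64]   # ARC-managed runners in-cluster
    steps:
      - uses: actions/checkout@v4
      # Hugo build, two-pass S3 sync, CloudFront invalidation follow here
```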

March 26, 2026 · 7 min · zolty

Linkerd Service Mesh: Why I'm Not Deploying It Yet (But Have a Plan Ready)

TL;DR I spent time evaluating Linkerd — the CNCF-graduated service mesh — for my homelab k3s cluster. The conclusion: it’s an impressive piece of engineering with genuinely useful features like automatic mTLS, post-quantum cryptography, and per-service observability. But for a cluster with ~20 workloads and a single operator, the operational overhead outweighs the benefits today. I’ve written a complete deployment plan so I can adopt it quickly when the cluster grows to the point where it makes sense. ...

March 24, 2026 · 8 min · zolty

Ditching AWS ECR for Self-Hosted Harbor: Why and How

TL;DR AWS ECR tokens expire every 12 hours. Every time the cron job that refreshes the pull secret fails, image pulls break cluster-wide. Docker Hub’s anonymous rate limit (100 pulls/6 hours) started hitting during CI builds that pull nginx:alpine and python:3.12-slim. I replaced both with self-hosted Harbor for container images and Gitea for package registries (PyPI, npm), backed by NFS on the NAS, deployed via Ansible and Helm, with Trivy vulnerability scanning on push. Thirteen CI workflows were updated in a single commit. Pull secrets never expire. Images never rate-limit. Monthly ECR cost drops to zero. ...
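The "pull secrets never expire" part comes down to a long-lived robot-account credential stored once as a `dockerconfigjson` Secret, instead of a cron-refreshed ECR token. A sketch, with the registry host, namespace, and robot name as placeholders:

```yaml
# Static pull secret for a Harbor robot account; unlike ECR's 12-hour
# tokens this never rotates. All names here are illustrative.
apiVersion: v1
kind: Secret
metadata:
  name: harbor-pull
  namespace: media
type: kubernetes.io/dockerconfigjson
data:
  # base64 of {"auths":{"harbor.example.lan":{"auth":"<robot-token>"}}}
  .dockerconfigjson: <base64-encoded-docker-config>
```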

March 21, 2026 · 5 min · zolty

PETG Filament Settings: Why the Advertised Temperatures Are Wrong

TL;DR PETG is the go-to filament for functional homelab parts — heat resistant, mechanically strong, and chemically stable. The advertised temperature ranges on most PETG spools (230-250C nozzle) are too low. After systematic testing, the settings that actually produce strong, well-bonded prints on the Bambu Lab P1S are 265C nozzle and 80C bed, with first layer at 270C/85C. These temperatures are 15-35C hotter than what the manufacturer label suggests. Amazon came through with a 4 kg bulk delivery that arrived the same day the previous spool ran out, which was cutting it closer than I would like. ...

March 20, 2026 · 9 min · zolty

The Bambu Lab P1S: Why Every Homelab Needs a 3D Printer

TL;DR I added a Bambu Lab P1S to the homelab and it has become one of the highest-value additions to the setup. Print quality out of the box is near injection-mold level for functional parts. I have already printed ventilated node enclosures, SFP+ cable routing brackets, custom rack shelves, and equipment mounts. Setup took under 30 minutes from unboxing to the first print. The ability to prototype custom hardware solutions in hours instead of waiting days for shipped parts changes how you approach infrastructure problems. ...

March 19, 2026 · 8 min · zolty

Scaling to Two Replicas and Failover Testing

TL;DR This is the moment everything was built for. Three phases of preparation — PostgreSQL provider (Day 3), storage migration (Day 4), state externalization (Day 5) — all leading to a single kubectl scale command. This post covers Phase 4: scaling the Jellyfin StatefulSet to 2 replicas, configuring anti-affinity to spread pods across nodes, running six structured failover tests, building Prometheus alerts, and one test that only partially passed. The headline result: killing a pod causes zero service downtime — users on the surviving replica experience no interruption at all, and displaced users reconnect within seconds. ...
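The anti-affinity piece of that setup could look like this fragment of the StatefulSet's pod spec — the `app: jellyfin` label is an assumption about the manifest:

```yaml
# Illustrative pod anti-affinity: requiredDuringScheduling forces the
# two replicas onto different nodes, so losing one node always leaves
# a surviving replica. Label values are assumed.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: jellyfin
        topologyKey: kubernetes.io/hostname
```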

March 11, 2026 · 10 min · zolty

Storage Refactoring and the SQLite-to-PostgreSQL Migration

TL;DR Phase 2 is the scariest phase. It’s where we take a running Jellyfin instance with years of playback history, user preferences, and media metadata — then swap the database from SQLite to PostgreSQL and restructure every volume. One wrong move and the family discovers their “Continue Watching” list is gone. This post covers deploying PostgreSQL as a k3s StatefulSet, restructuring Jellyfin’s volume layout from a monolithic RWO PVC to NFS shared config + Longhorn per-pod storage, and building a SQLite-to-PostgreSQL migration tool. ...
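The core of a SQLite-to-PostgreSQL migration tool is mechanical: read each table out of SQLite and replay the rows as parameterized INSERTs on the Postgres side. This is a minimal stdlib sketch of that idea — not the actual migration code, and the toy `users` table is an assumption standing in for Jellyfin's real schema:

```python
import sqlite3

def sqlite_table_to_inserts(conn, table):
    """Dump one SQLite table as a PostgreSQL-style parameterized INSERT.

    Returns (sql, rows): an INSERT statement with %s placeholders
    (psycopg style) and the list of row tuples to execute it with.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    cols = [d[0] for d in cur.description]
    placeholders = ", ".join(["%s"] * len(cols))
    sql = f'INSERT INTO {table} ({", ".join(cols)}) VALUES ({placeholders})'
    return sql, cur.fetchall()

if __name__ == "__main__":
    # Toy stand-in for the real database; a migration tool would loop
    # over sqlite_master and feed each statement to a Postgres cursor.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [(1, "alice"), (2, "bob")])
    sql, rows = sqlite_table_to_inserts(conn, "users")
    print(sql)   # INSERT INTO users (id, name) VALUES (%s, %s)
    print(rows)  # [(1, 'alice'), (2, 'bob')]
```

The real tool also has to handle type mapping (SQLite's loose affinity vs. Postgres's strict types) and sequence resets, which is where most of the risk described above lives.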

March 9, 2026 · 8 min · zolty

Affiliate Disclosure: Some links on this site are affiliate links (Amazon Associates, DigitalOcean referral). As an Amazon Associate, I earn from qualifying purchases. This does not affect the price you pay or my editorial independence — I only recommend products and services I personally use and trust.