Voice AI services on k3s

Voice AI on k3s: Whisper, Piper, and openWakeWord in Kubernetes

TL;DR I deployed Whisper (speech-to-text), Piper (text-to-speech), and openWakeWord (wake word detection) as Kubernetes workloads on my k3s cluster. Home Assistant connects to them over the Wyoming protocol for fully local voice pipelines. Total resource cost: ~1 CPU core and 1.75GB RAM. Total cloud cost: $0.

Why run voice services in Kubernetes

Home Assistant’s voice pipeline needs three things: something to listen for a wake word, something to transcribe speech, and something to speak back. The usual approach is running these on the same box as HA, or on a dedicated Pi. Both work fine until you want the services to survive node failures, be independently upgradeable, or share resources with other workloads. ...

April 7, 2026 · 6 min · zolty
AWS Lens running as a web server on k3s

Running AWS Lens as a Self-Hosted Web App on k3s

TL;DR AWS Lens is an open-source Electron desktop app for managing AWS resources — EC2, S3, Lambda, IAM, Cost Explorer, and more. I wanted it accessible from my browser without running a desktop app. I adapted it to run as a containerized Express server on k3s, fixed a class of runtime crashes from the Electron-to-web adapter, hardened it against three security issues, and deployed it behind Traefik and Let’s Encrypt. The changes are open-source in BoraKostem/AWS-Lens#21. ...

March 30, 2026 · 7 min · zolty
Week of March 23 retrospective

Week of March 23: Security Patches, AI Tooling, and Defending the Homelab on Reddit

TL;DR Busy week. Three CVE patches shipped on the same day. OpenClaw stabilized with OpenRouter support and a cost exporter. The Wiki.js fork with Mermaid 11 went live after clearing a Trivy scan. PiKey — a Raspberry Pi that pretends to be a Bluetooth keyboard — shipped as a side project. A self-hosted GitHub Actions cache server cut CI restore times from minutes to seconds. And a Reddit comment defending “I use Claude to manage my infrastructure” turned into five new blog posts and a documentation sprint. ...

March 29, 2026 · 6 min · zolty
Wiki.js fork with Mermaid 11

Forking Wiki.js to Get Mermaid 11: When Upstream Won't Move

TL;DR Wiki.js 2.x ships Mermaid 8.8.2, released in 2020. Mermaid 11 — the current stable release — adds timeline diagrams, improved gitGraph, better theming, and fixes years of rendering bugs. The upstream project defers this upgrade to Wiki.js v3, which has no release date. The PR queue has sat idle for over a year. I forked requarks/wiki at tag v2.5.312, upgraded Mermaid in-place, patched 8 CVEs including one Critical SAML authentication bypass, reduced the vulnerability count from 8 Critical / 48 High to 3 Critical / 42 High, and deployed it to the cluster. The fork stays close to upstream — Vue 2 and Webpack 4 are untouched. Only the Mermaid surface and security dependencies are modified. ...

March 29, 2026 · 5 min · zolty
Self-hosted GitHub Actions cache server

Self-Hosting a GitHub Actions Cache Server on NAS Storage

TL;DR If you run self-hosted GitHub Actions runners, every actions/cache step is round-tripping to GitHub’s cloud storage. For a homelab cluster with local runners, that means cache restores travel from GitHub’s CDN to your runner, through your ISP, and back – even though the runner is 10 feet from your NAS. I deployed falcondev-oss/github-actions-cache-server as a Kubernetes deployment, pointed it at NFS storage on my NAS, set one environment variable on my runners, and flushed all the GitHub-hosted caches. Zero workflow changes required. ...

March 27, 2026 · 5 min · zolty
K3s stability improvements

Two Months of K3s Stability Improvements

TL;DR Over the past two months, I have made a series of stability improvements to my k3s homelab cluster. The biggest wins: migrating from AWS ECR to self-hosted Harbor (eliminating 12-hour token expiry), fixing recurring Grafana crashes caused by SQLite corruption on Longhorn, recovering pve4 after a failed LXC experiment, hardening NetworkPolicies to close gaps in pod-to-host traffic rules, and patching multiple CVEs across the media stack. The cluster now runs 7/7 nodes on k3s v1.34.4, all services monitored, all images pulled from Harbor with static credentials that never expire. ...

March 27, 2026 · 8 min · zolty
Authentik identity platform

Planning Authentik: Centralized Identity for a Homelab

TL;DR I am deploying Authentik as a centralized identity provider for my k3s cluster. It replaces the current OAuth2 Proxy setup with proper SSO, federates Google as a social login source, and introduces group-based RBAC (admins, writers, readers) across all services. The migration is phased – public services first via Traefik forwardAuth, then internal services via native OIDC, then proxy-protected apps that have no OIDC support. OAuth2 Proxy stays in git for instant rollback. This post covers the architecture, the user model, the edge security design, and the gotchas I expect to hit. ...

March 27, 2026 · 7 min · zolty
Jellyfin hardware stress tester

Stress Testing GPU Transcoding in Kubernetes with JF_hw_stress

TL;DR JF_hw_stress is a headless transcoding stress tester that answers one question: how many concurrent transcode streams can your GPU actually handle before quality degrades? It runs escalating FFmpeg transcodes against real media files using VAAPI hardware acceleration, measures FPS ratios, and outputs a JSON report. I run it as a Kubernetes Job on the same k3s cluster from Cluster Genesis, scheduled exclusively on the GPU node (Intel UHD 630). The job auto-deletes after 10 minutes so it does not accumulate stale pods. ...
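
The "how many streams before quality degrades" question above comes down to comparing measured FPS against the source frame rate at each concurrency level. The following is a minimal sketch of that comparison, not the tool's actual code: the function names, the 1.0 ratio threshold, the report keys, and the sample FPS numbers are all made up for illustration and are not measurements from the post.

```python
import json


def fps_ratio(measured_fps, source_fps):
    """Ratio of transcode speed to playback speed; below 1.0 means slower than real time."""
    return measured_fps / source_fps


def max_sustainable_streams(results, source_fps, threshold=1.0):
    """Return the highest stream count at which every stream meets the threshold.

    `results` maps a concurrent stream count to the FPS measured for each stream.
    Levels are scanned in ascending order and the scan stops at the first failure,
    mirroring an escalating test.
    """
    best = 0
    for count in sorted(results):
        ratios = [fps_ratio(fps, source_fps) for fps in results[count]]
        if min(ratios) < threshold:
            break
        best = count
    return best


# Hypothetical sample data (NOT real measurements): FPS per stream at each level.
sample = {
    1: [96.0],
    2: [48.0, 47.5],
    3: [31.0, 30.2, 29.8],
    4: [22.1, 21.9, 22.4, 21.7],
}

# Hypothetical report shape; the real tool's JSON schema may differ.
report = {"source_fps": 24.0, "max_streams": max_sustainable_streams(sample, 24.0)}
print(json.dumps(report))
```

With this sample input, the fourth level is the first where a stream drops below the source frame rate, so the sketch reports three streams as the limit.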

March 27, 2026 · 6 min · zolty
Homelab infrastructure overview

Homelab State of the Union: 42 Namespaces and Counting

TL;DR The homelab runs 42 Kubernetes namespaces across 7 nodes (3 control plane, 4 workers) on 4 Lenovo ThinkCentre M920q mini PCs running Proxmox VE. This post is the result of a full infrastructure audit: reconciling what’s actually running against what’s documented, catching version drift, and noting what’s been added, removed, or broken since the last check.

Compute

Four Lenovo ThinkCentre M920q nodes form the physical layer:

| Host | CPU | RAM | NVMe | Role |
|------|----------|------|-------|------------------------------|
| pve1 | i5-8500T | 32GB | 512GB | 1 server VM + 1 agent VM |
| pve2 | i5-8500T | 32GB | 512GB | 1 server VM + 1 agent VM |
| pve3 | i5-8500T | 32GB | 512GB | 1 server VM + 1 agent VM |
| pve4 | i7-8700T | 32GB | 512GB | 1 agent VM (GPU passthrough) |

The k3s cluster runs v1.34.4+k3s1 with embedded etcd for HA. All 7 nodes report Ready. Server VMs get 2 cores and 6GB each, just enough for etcd and the API server. Agent VMs are beefier: 6 cores and 22GB on pve1-3, 12 cores and 28GB on pve4. ...
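
The VM allocation figures above can be totalled per host as a quick sanity check. This is a small worked calculation using only the numbers quoted in the summary; the variable and function names are my own.

```python
# VM allocations taken from the figures in the summary above.
# Each entry is (vCPU cores, RAM in GB).
SERVER_VM = (2, 6)
AGENT_VM_STANDARD = (6, 22)  # agent VMs on pve1-3
AGENT_VM_GPU = (12, 28)      # agent VM on pve4

HOST_RAM_GB = 32  # every host in the table has 32GB

hosts = {
    "pve1": [SERVER_VM, AGENT_VM_STANDARD],
    "pve2": [SERVER_VM, AGENT_VM_STANDARD],
    "pve3": [SERVER_VM, AGENT_VM_STANDARD],
    "pve4": [AGENT_VM_GPU],
}


def allocated(vms):
    """Sum vCPU cores and RAM across the VMs placed on one host."""
    return sum(cores for cores, _ in vms), sum(ram for _, ram in vms)


per_host = {name: allocated(vms) for name, vms in hosts.items()}
total_vms = sum(len(vms) for vms in hosts.values())
total_cores = sum(cores for cores, _ in per_host.values())
total_ram = sum(ram for _, ram in per_host.values())

for name, (cores, ram) in per_host.items():
    print(f"{name}: {cores} vCPU, {ram}GB of {HOST_RAM_GB}GB allocated to VMs")
print(f"cluster: {total_vms} VMs, {total_cores} vCPU, {total_ram}GB")
```

The totals come to 7 VMs, matching the 7 nodes, with 28GB of each host's 32GB assigned to VMs and 4GB per host left outside them.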

March 26, 2026 · 6 min · zolty
OpenClaw multi-user AI gateway

OpenClaw Multi-User: Privacy, Dual AI Backends, and Per-User Cost Tracking

TL;DR Multi-user AI chat with privacy guarantees, dual model providers (Anthropic direct API + AWS Bedrock via LiteLLM), and per-user cost tracking via Prometheus and Grafana. The admin cannot read other users’ conversations. Three family members authenticate via Google OAuth, each getting isolated chat sessions. Anthropic serves as the primary model provider with lower latency, and Bedrock via LiteLLM acts as a fallback. Per-user spend is tracked through LiteLLM’s Prometheus metrics without any surveillance of conversation content. This is a follow-up to the OpenClaw on k3s setup post. ...
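
The primary-plus-fallback arrangement described above follows a common pattern: try providers in priority order and move on when one fails. The sketch below illustrates that pattern only. It does not show how OpenClaw or LiteLLM implement routing, and the two provider functions are hypothetical stand-ins that make no real API calls.

```python
class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has raised an error."""


def complete_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order and return the first success.

    Returns a (provider_name, reply) tuple so the caller can record
    which backend actually served the request.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for the two backends; neither talks to a real service.
def anthropic_direct(prompt):
    raise TimeoutError("simulated outage")


def bedrock_via_litellm(prompt):
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "hello",
    [("anthropic", anthropic_direct), ("bedrock", bedrock_via_litellm)],
)
print(used, reply)
```

Returning the provider name alongside the reply is a design choice that makes it easy to attribute each request to a backend without inspecting the conversation content.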

March 25, 2026 · 13 min · zolty

Affiliate Disclosure: Some links on this site are affiliate links (Amazon Associates, DigitalOcean referral). As an Amazon Associate, I earn from qualifying purchases. This does not affect the price you pay or my editorial independence — I only recommend products and services I personally use and trust.