Jellyfin state externalization architecture

State Externalization and the Sticky Session Compromise

TL;DR Phase 3 is where the rubber meets the road. We have PostgreSQL for persistent data (Day 4) and NFS for shared config. But Jellyfin still holds critical runtime state — sessions, users, devices, tasks — in 11 ConcurrentDictionary instances scattered across singleton managers. Two pods with independent memory spaces mean two independent views of reality. This post covers the state externalization decision: what got moved to Redis, what got solved by sticky sessions, what got disabled entirely, and why pragmatism beat perfection for a homelab media server. ...
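As a rough illustration of the sticky-session half of that compromise — assuming an ingress-nginx controller (the annotation names below are specific to that controller; the hostname and Service name are hypothetical, and none of this is confirmed by the post) — cookie-based session affinity pins each client to one replica:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jellyfin
  annotations:
    # ingress-nginx cookie affinity: a given client keeps hitting the same pod
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "jellyfin-affinity"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"
spec:
  rules:
    - host: jellyfin.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: jellyfin
                port:
                  number: 8096
```

With affinity in place, state that only matters to one client's session can stay in-process, while genuinely shared state still needs Redis.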

March 10, 2026 · 11 min · zolty
Jellyfin PostgreSQL database provider architecture

Forking Jellyfin: A PostgreSQL Database Provider in .NET 10

TL;DR Jellyfin stores everything in SQLite. Metadata, users, activity logs, authentication — all of it lives in .db files that lock under concurrent access. To run multiple replicas, we need a real network-accessible database. This post covers Phase 1 of the HA conversion: forking Jellyfin, designing a pluggable database provider interface, implementing it for PostgreSQL with Npgsql, generating EF Core migrations, writing integration tests with Testcontainers, and building a custom Docker image. ...

March 8, 2026 · 8 min · zolty
Jellyfin single-instance architecture diagram

Why Jellyfin Can't Scale (And What We're Going to Do About It)

TL;DR Jellyfin is a fantastic open-source media server. It is also, architecturally, a single-process application that assumes it’s the only instance running. SQLite as the database. Eleven ConcurrentDictionary caches holding sessions, users, devices, and task queues in memory. A file-based config directory that gets written to at runtime. None of this survives a second pod. This is the first post in a seven-part series documenting how I converted Jellyfin into a highly available, multi-replica deployment on my home k3s cluster. The project spans two repositories, four phases, ~20 GitHub Issues executed by AI agents, and a live failover demo where I killed a pod and the service continued with zero downtime — users on the surviving replica never saw an interruption. ...

March 6, 2026 · 9 min · zolty
Self-hosted AI chat deployment

Self-Hosted AI Chat: Open WebUI, LiteLLM, and AWS Bedrock on k3s

TL;DR I deployed a private, self-hosted ChatGPT alternative on the homelab k3s cluster. Open WebUI provides a polished chat interface. LiteLLM acts as a proxy that translates the OpenAI API into AWS Bedrock’s Converse API. Four models are available: Claude Sonnet 4, Claude Haiku 4.5, Amazon Nova Micro, and Amazon Nova Lite. Authentication is handled by the existing OAuth2 Proxy – no additional SSO configuration needed. The whole stack runs in three pods consuming under 500MB of RAM, and the only ongoing cost is per-request Bedrock pricing. No API keys from OpenAI or Anthropic required. ...
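A minimal sketch of the translation layer, using LiteLLM's standard config format (`model_list` and `litellm_params` are real LiteLLM keys; the exact Bedrock model IDs below are assumptions, not taken from the post):

```yaml
# litellm config.yaml — expose OpenAI-style model names backed by Bedrock
model_list:
  - model_name: claude-sonnet-4        # name Open WebUI sees
    litellm_params:
      model: bedrock/us.anthropic.claude-sonnet-4-20250514-v1:0  # assumed ID
  - model_name: nova-micro
    litellm_params:
      model: bedrock/us.amazon.nova-micro-v1:0                   # assumed ID
```

Open WebUI then points at the LiteLLM service as if it were the OpenAI API, and LiteLLM handles the Converse API calls and AWS credentials.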

March 4, 2026 · 8 min · zolty
AI Dungeon Master platform architecture diagram

Building an AI Dungeon Master: Full-Stack D&D Platform on k3s

TL;DR I’m building a multiplayer D&D platform where an AI powered by AWS Bedrock Claude runs the game. Players connect via a Next.js web app or Discord. A 5-tier lore context system gives the AI persistent memory across sessions. A background world simulation engine tracks NPC positions, inventory, faction standings, and in-game time so the AI can focus on storytelling instead of bookkeeping. The foundation is fully deployed on my home k3s cluster. The current work is turning a working tech demo into a game people actually want to sit down and play. ...

March 2, 2026 · 14 min · zolty
Natural language media requests via Jellyseerr

I Am Zolty: Building a Natural Language Media Request System

TL;DR Jellyseerr already knows what I have. Radarr and Sonarr already know how to find things. The missing piece was a front door that understood intent instead of requiring me to search for specific titles. I wired Jellyseerr’s REST API to Claude and gave it a system prompt that knows my taste profile. Now I can say “download 100GB of family-friendly anime I might like” and get a queue of requests back. A Kubernetes CronJob runs the same prompt on a schedule so the library grows without me thinking about it. ...
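A sketch of the final hop — turning one Claude suggestion into a Jellyseerr request. The `POST /api/v1/request` body shape matches Jellyseerr's (Overseerr-derived) API, but the URL, API key handling, and helper names here are hypothetical, not the post's actual code:

```python
"""Push one suggested title into Jellyseerr (sketch)."""
import json
from urllib import request

JELLYSEERR_URL = "http://jellyseerr.media.svc:5055"  # hypothetical in-cluster address
API_KEY = "changeme"  # hypothetical; Jellyseerr issues API keys in its settings UI


def build_request_payload(tmdb_id: int, media_type: str = "movie") -> dict:
    """Body for Jellyseerr's POST /api/v1/request endpoint."""
    return {"mediaType": media_type, "mediaId": tmdb_id}


def submit(tmdb_id: int, media_type: str = "movie") -> None:
    """Submit one request; requires a live Jellyseerr instance."""
    req = request.Request(
        f"{JELLYSEERR_URL}/api/v1/request",
        data=json.dumps(build_request_payload(tmdb_id, media_type)).encode(),
        headers={"X-Api-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        resp.read()
```

The CronJob version simply runs the Claude prompt, maps each suggested title to a TMDB ID, and calls something like `submit()` for each one.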

February 21, 2026 · 5 min · zolty
Media stack on Kubernetes

Building a Complete Media Stack with Kubernetes

TL;DR The media stack is now fully automated: content gets sourced, synced from a remote seedbox to the local NAS via an rclone CronJob, organized by Radarr/Sonarr, and served by Jellyfin with Intel iGPU hardware transcoding. I also deployed a Media Controller for lifecycle management and a Media Profiler for content analysis. This post covers the full pipeline from acquisition to playback.

The Media Pipeline

Jellyseerr (request)
        │
        ▼
Radarr/Sonarr (search + organize)
        │
        ▼
Prowlarr (indexer management)
        │
        ▼
Seedbox (remote acquisition)
        │
        ▼
rclone-sync CronJob (seedbox → NAS)
        │
        ▼
NAS (TrueNAS, VLAN 30)
        │
        ▼
Jellyfin (playback + GPU transcode)

Each component runs as a Kubernetes deployment in the media namespace. ...
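The seedbox → NAS step could look roughly like this as a Kubernetes CronJob (a sketch: the schedule, rclone remote name, paths, and the NFS server address on VLAN 30 are all assumptions, not taken from the post):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: rclone-sync
  namespace: media
spec:
  schedule: "*/15 * * * *"   # hypothetical cadence
  concurrencyPolicy: Forbid  # never let two syncs overlap
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: rclone
              image: rclone/rclone:latest
              args: ["sync", "seedbox:/downloads", "/nas/media", "--transfers=4"]
              volumeMounts:
                - name: nas
                  mountPath: /nas/media
          volumes:
            - name: nas
              nfs:
                server: 192.168.30.10   # hypothetical TrueNAS address
                path: /mnt/tank/media   # hypothetical export path
```

`concurrencyPolicy: Forbid` matters here: a slow sync that overruns the schedule should delay the next run, not race it on the same files.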

February 17, 2026 · 5 min · zolty
AI-powered alert analysis

Building an AI-Powered Alert System with AWS Bedrock

TL;DR Today I deployed two significant additions to the cluster: an AI-powered Alert Responder that uses AWS Bedrock (Amazon Nova Micro) to analyze Prometheus alerts and post remediation suggestions to Slack, and a multi-user dev workspace with per-user environments. I also hardened the cluster by constraining all workloads to the correct architecture nodes and fixing arm64 scheduling issues. The Alert Responder Running 13+ applications on a homelab cluster means alerts fire regularly. Most are straightforward — high memory, restart loops, certificate expiry warnings — but analyzing each one, determining root cause, and knowing the right remediation command gets tedious, especially at 2 AM. ...
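To make the flow concrete, here is a sketch of the prompt-building step. The Alertmanager webhook payload shape (`alerts`, `labels`, `annotations`) is standard; the prompt wording, function name, and the commented Bedrock call are assumptions, not the Alert Responder's actual code:

```python
"""Turn an Alertmanager webhook body into a prompt for the model (sketch)."""


def build_prompt(webhook: dict) -> str:
    """Summarize each firing alert as one line for the LLM to analyze."""
    lines = ["Analyze these Prometheus alerts and suggest remediation:"]
    for alert in webhook.get("alerts", []):
        labels = alert.get("labels", {})
        ann = alert.get("annotations", {})
        lines.append(
            f"- [{alert.get('status', 'firing')}] {labels.get('alertname', '?')} "
            f"on {labels.get('instance', 'unknown')}: {ann.get('summary', '')}"
        )
    return "\n".join(lines)


# Sending the prompt to Nova Micro via boto3's Converse API would look like:
# client = boto3.client("bedrock-runtime")
# client.converse(
#     modelId="amazon.nova-micro-v1:0",
#     messages=[{"role": "user", "content": [{"text": build_prompt(body)}]}],
# )
```

The response text then gets posted to Slack, so the 2 AM page arrives with a suggested remediation already attached.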

February 14, 2026 · 5 min · zolty
Microservices architecture

Deploying a Microservices Architecture on k3s

TL;DR Today I deployed the most architecturally complex application on the cluster: a video service platform with a Vue.js frontend, 7 FastAPI backend microservices, NATS for messaging, PostgreSQL for persistence, and Redis for caching. This post covers the deployment patterns for NATS-based microservices on k3s and the RBAC fixes needed for Helm-based deployments.

The Application Architecture

The video service platform is a full microservices stack:

┌──────────────┐
│   Vue.js     │  Frontend SPA
│  Frontend    │
└──────┬───────┘
       │ HTTP/REST
┌──────┴───────────────────────────────────────┐
│                 API Gateway                  │
└──────┬───────────────────────────────────────┘
       │
┌──────┴───────────────────────────────────────┐
│            FastAPI Microservices             │
│   ┌─────┐  ┌─────┐  ┌─────┐  ┌─────┐         │
│   │Auth │  │Video│  │Media│  │Queue│         │
│   └─────┘  └─────┘  └─────┘  └─────┘         │
│   ┌─────┐  ┌─────┐  ┌─────┐                  │
│   │Stats│  │User │  │Notif│                  │
│   └─────┘  └─────┘  └─────┘                  │
└──────────────────────────────────────────────┘
      │             │             │
┌─────┴────┐  ┌─────┴────┐  ┌─────┴────┐
│PostgreSQL│  │   NATS   │  │  Redis   │
└──────────┘  └──────────┘  └──────────┘

Seven FastAPI services communicate via NATS for asynchronous messaging and Redis for shared state. PostgreSQL handles persistent data. ...
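For a flavor of the messaging layer, here is a sketch of a shared event envelope the services might exchange over NATS. The subject naming, envelope fields, and the commented nats-py usage are illustrative assumptions; the post doesn't specify the platform's actual wire format:

```python
"""A JSON event envelope for NATS messages between services (sketch)."""
import json
from dataclasses import asdict, dataclass


@dataclass
class Event:
    subject: str   # e.g. "video.transcode.requested" (hypothetical naming)
    payload: dict  # event-specific data

    def encode(self) -> bytes:
        """Serialize for nc.publish()."""
        return json.dumps(asdict(self)).encode()

    @classmethod
    def decode(cls, raw: bytes) -> "Event":
        """Rebuild an Event from a received message body."""
        return cls(**json.loads(raw))


# Publishing from inside a service with nats-py (requires a running NATS server):
# nc = await nats.connect("nats://nats.video.svc:4222")
# ev = Event("video.transcode.requested", {"video_id": 42})
# await nc.publish(ev.subject, ev.encode())
```

Keeping the envelope in one shared module is what lets seven independent FastAPI services agree on message shapes without a schema registry.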

February 13, 2026 · 5 min · zolty
Digital Signage deployment

Migrating a Full-Stack App to Kubernetes: Digital Signage on k3s

TL;DR Today I migrated Digital Signage — an Angular SPA backed by 7 Flask microservices, an MQTT broker, and PostgreSQL — from a development environment to the k3s cluster. This is the most complex application on the cluster so far, and deploying it taught me a lot about managing multi-service applications in Kubernetes. The Application Digital Signage started as a side project back in May 2025, designed to drive informational displays on Raspberry Pi kiosk devices. It evolved over the months into a surprisingly complex system: ...

February 11, 2026 · 5 min · zolty

Affiliate Disclosure: Some links on this site are affiliate links (Amazon Associates, DigitalOcean referral). As an Amazon Associate, I earn from qualifying purchases. This does not affect the price you pay or my editorial independence — I only recommend products and services I personally use and trust.