Jellyfin storage architecture diagram

Storage Refactoring and the SQLite-to-PostgreSQL Migration

TL;DR Phase 2 is the scariest phase. It’s where we take a running Jellyfin instance with years of playback history, user preferences, and media metadata — then swap the database from SQLite to PostgreSQL and restructure every volume. One wrong move and the family discovers their “Continue Watching” list is gone. This post covers deploying PostgreSQL as a k3s StatefulSet, restructuring Jellyfin’s volume layout from a monolithic RWO PVC to NFS shared config + Longhorn per-pod storage, and building a SQLite-to-PostgreSQL migration tool. ...

March 9, 2026 · 8 min · zolty
Monitoring goes blind — Longhorn storage corruption incident report

When Monitoring Goes Blind: A Longhorn Storage Corruption Incident

TL;DR Grafana went completely dark for about 26 hours on my home k3s cluster. Two things broke simultaneously: Loki entered CrashLoopBackOff, and Prometheus silently stopped ingesting metrics — its pods showed as healthy and 2/2 Running the whole time. The actual cause was Longhorn’s auto-balancer migrating replicas onto a freshly-added cluster node (k3s-agent-4) that had unstable storage during its first 48 hours. The replica I/O errors propagated directly into the workloads, corrupting mid-write files: a Prometheus WAL segment and a Loki TSDB index file. Both required offline surgery via a busybox pod to delete the corrupted files before the services could recover. ...

February 25, 2026 · 8 min · zolty

Affiliate Disclosure: Some links on this site are affiliate links (Amazon Associates, DigitalOcean referral). As an Amazon Associate, I earn from qualifying purchases. This does not affect the price you pay or my editorial independence — I only recommend products and services I personally use and trust.