TL;DR
Jellyfin is a fantastic open-source media server. It is also, architecturally, a single-process application that assumes it’s the only instance running. SQLite as the database. Eleven ConcurrentDictionary caches holding sessions, users, devices, and task queues in memory. A file-based config directory that gets written to at runtime. None of this survives a second pod.
This is the first post in a seven-part series documenting how I converted Jellyfin into a highly available, multi-replica deployment on my home k3s cluster. The project spans two repositories, four phases, ~20 GitHub Issues executed by AI agents, and a live failover demo where I killed a pod and the service continued with zero downtime — users on the surviving replica never saw an interruption.
This post covers why the problem is harder than “just add replicas” and the strategy I chose to solve it.
The Problem
My Jellyfin deployment serves a household of users across TVs, phones, tablets, and laptops. When I need to drain a node for maintenance, or a pod crashes, or Longhorn decides to rebuild a volume — Jellyfin goes down hard. Every connected client shows “unable to connect.” Playback stops. The Android TV app spins. Someone yells from the living room.
This is the standard experience for every homelab media server. Plex, Emby, Jellyfin — they’re all designed as a single process running on a single machine. The Kubernetes community wraps them in Deployments with replicas: 1 and Recreate strategy and calls it a day.
I wanted to do better.
Why “Just Add Replicas” Doesn’t Work
If you set replicas: 2 on a vanilla Jellyfin Deployment, here’s what happens:
- **SQLite locks up.** Two processes cannot write to the same SQLite database concurrently. The second pod either gets `SQLITE_BUSY` errors or silently corrupts the database file.
- **Sessions split.** User sessions live in a `ConcurrentDictionary<string, SessionInfo>` in `SessionManager.cs`. Pod A knows you're logged in. Pod B doesn't. Every other request returns 401.
- **Config PVC is RWO.** The Longhorn PVC holding `/config` is `ReadWriteOnce`. Only one pod can mount it. The second pod stays in `Pending` forever.
- **Transcodes conflict.** FFmpeg writes transcode segments to disk. Two pods writing to the same directory produce garbled HLS playlists.
- **Scheduled tasks run twice.** Library scans, image extraction, plugin updates — everything in `TaskManager` fires on both pods simultaneously, doubling the work and potentially corrupting metadata.
- **QuickConnect breaks.** The pairing code flow stores pending requests in memory. You generate a code on Pod A, try to authorize it on Pod B — it doesn't exist.
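The SQLite failure mode is easy to reproduce without Jellyfin at all. A minimal sketch in plain Python (nothing below is Jellyfin code — just two connections to the same file standing in for two pods):

```python
import os
import sqlite3
import tempfile

# One database file, two connections — stand-ins for two pods.
path = os.path.join(tempfile.mkdtemp(), "library.db")

writer_a = sqlite3.connect(path)
writer_a.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
writer_a.commit()

# "Pod A" opens a write transaction and holds the write lock.
writer_a.execute("BEGIN IMMEDIATE")
writer_a.execute("INSERT INTO items (name) VALUES ('movie')")

# "Pod B" tries to write. With a short busy timeout it fails fast
# instead of spinning; Jellyfin would see SQLITE_BUSY here.
writer_b = sqlite3.connect(path, timeout=0.1)
try:
    writer_b.execute("INSERT INTO items (name) VALUES ('show')")
    outcome = "ok"
except sqlite3.OperationalError as e:
    outcome = str(e)

writer_a.commit()
print(outcome)  # "database is locked"
```

SQLite's busy handler retries for the timeout and then gives up — and that's the *graceful* case. The corruption case comes from mixing this with network filesystems or multiple hosts, where SQLite's file locking can't be trusted at all.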
This isn’t a deployment problem. It’s an architecture problem.
Prior Art: What’s Been Tried Before
I’m not the first person to attempt this. The most impressive prior work is pseudopseudonym’s highly available Jellyfin setup, documented in a Reddit post on r/selfhosted that got 207 upvotes and generated a genuinely useful discussion. Their setup runs Jellyfin across 3 dedicated Intel N100 hosts on a Proxmox cluster backed by Ceph (for config/database storage) and CubeFS (for petabyte-scale media storage), with Rook managing Ceph PVCs inside Kubernetes.
It works — there’s a video demo of Jellyfin surviving a full node power-off. But as pseudopseudonym acknowledged in the Reddit comments: it’s failover, not true HA. Only one Jellyfin instance runs at a time. When the active pod dies, Rook releases the Ceph RBD volume and a new pod claims it on another node. The bottleneck is the PVC release and pod reschedule cycle — even with aggressively tuned kubelet timers (node-status-update-frequency=4s, pod-eviction-timeout=30s), failover takes ~2.5 minutes. During that window, Jellyfin is completely down.
The fundamental constraint is SQLite. As long as Jellyfin uses SQLite, only one process can access the database at a time. The Ceph volume must be single-mount (RWO). Two pods means two SQLite writers, which means corruption.
A Hacker News discussion about SQLite concurrency captured the community debate perfectly. One commenter (a distributed systems architect) argued that Jellyfin’s single-instance assumptions make it harder to operate even on a single node, and that targeting PostgreSQL would be “strictly simpler from the perspective of the application.” Another pointed out that Jellyfin 10.11’s EF Core refactor — moving from raw Microsoft.Data.Sqlite calls to Entity Framework — finally created the foundation for a pluggable database layer.
That EF Core refactor is what makes this project possible. Without it, adding PostgreSQL support would have required rewriting hundreds of raw SQL queries. With EF Core, it’s a provider swap.
How This Project Differs
| Aspect | pseudopseudonym’s Approach | This Project |
|---|---|---|
| Architecture | Passive failover (1 pod at a time) | Active-active (2 pods, both serving traffic) |
| Database | SQLite on Ceph RBD (RWO) | PostgreSQL StatefulSet (network-accessible) |
| Failover time | ~2.5 minutes (PVC release + reschedule) | Zero downtime (surviving pod serves continuously; displaced clients reconnect in ~12s) |
| Storage stack | Ceph + CubeFS + Rook | NFS + Longhorn (much simpler) |
| State coordination | None (single instance) | Redis leader election + sticky sessions |
| Hardware scale | 864 EPYC cores, 3.75 TiB RAM, 64 K8s workers | 7-node k3s cluster, consumer hardware |
| Custom tooling | rook-push (Rook PVC taint automation) | IJellyfinDatabaseProvider (pluggable DB layer) |
Their approach is elegant for its constraints: given that SQLite is non-negotiable, shared block storage with single-mount failover is the best you can do. This project removes the constraint. By replacing SQLite with PostgreSQL and externalizing state coordination to Redis, we can run multiple pods simultaneously — something that’s architecturally impossible with SQLite.
The tradeoff: they didn’t need to fork Jellyfin. We do.
The In-Memory State Inventory
Before designing any solution, I needed to understand exactly what state Jellyfin holds in memory. I went through the codebase (Jellyfin 10.12.0, .NET 10) and cataloged every ConcurrentDictionary and stateful singleton:
| Manager | State | Data Structure | HA Impact |
|---|---|---|---|
| SessionManager | Active user sessions | `ConcurrentDictionary<string, SessionInfo>` | Critical — every API request checks session |
| SessionManager | Live stream sessions | `ConcurrentDictionary<string, ConcurrentDictionary<string, string>>` | Critical — active playback tracking |
| UserManager | All users (cached) | `ConcurrentDictionary<Guid, User>` | High — auth on every request |
| DeviceManager | Client capabilities | `ConcurrentDictionary<string, ClientCapabilities>` | Medium — client feature negotiation |
| QuickConnectManager | Pending pairing requests | `ConcurrentDictionary<string, QuickConnectResult>` | High — pairing fails across instances |
| QuickConnectManager | Authorized secrets | `ConcurrentDictionary<string, (DateTime, AuthenticationResult)>` | High |
| SyncPlayManager | Active groups | 3 interconnected `ConcurrentDictionary` instances | Deferred — too complex, low usage |
| TaskManager | Task queue | `ConcurrentQueue<Tuple<Type, TaskOptions>>` with lock | High — duplicate task execution |
| ProviderManager | Metadata refreshes | `ConcurrentDictionary<Guid, double>` + PriorityQueue | Medium — progress tracking |
| TranscodeManager | Active FFmpeg jobs | `List<TranscodingJob>` with lock + AsyncKeyedLocker | Deferred — solved by sticky sessions |
| ChannelManager | Channel media cache | `IMemoryCache` | Low — regenerable |
That’s eleven pieces of critical state held purely in memory, spread across nine manager classes. Two of those managers — SessionManager and UserManager — are consulted on every single API request.
Three Strategies
I evaluated three approaches:
Strategy A: Active-Passive
Run two pods, but only one serves traffic. The other is warm standby. On failure, Traefik routes to the standby.
Pros: Simple. No state sharing needed.
Cons: Wastes resources. The standby pod does nothing most of the time. Failover requires the standby to warm its caches (user list, session state) from scratch — slow.
Strategy B: Active-Active with Sticky Sessions
Run two pods, both serving traffic. Traefik uses cookie-based affinity to route returning clients to the same pod. State stays local per pod. On failure, clients get routed to the surviving pod and re-authenticate.
Pros: Both pods serve traffic. Simple sticky session config. Both pods have GPU access for hardware transcoding.
Cons: Session loss on pod death. Some features (SyncPlay, QuickConnect) only work on the pod that initiated them.
Strategy C: True Multi-Master
Externalize all state to Redis. Any pod can serve any request. Full session portability.
Pros: Seamless failover. No client-visible impact on pod death.
Cons: Requires modifying 6+ core Jellyfin classes (SessionManager, UserManager, DeviceManager, QuickConnectManager, TaskManager). Fork maintenance burden compounds monthly when upstream releases. SyncPlayManager needs real-time sub-millisecond coordination that’s impractical to externalize.
The Decision
Strategy B — Active-Active with sticky sessions. Here’s why:
FFmpeg transcoding is an OS process. It writes segments to local disk. You cannot distribute a single transcode job across pods. Sticky sessions ensure the client that started a transcode stays on the pod running it. This solves transcoding HA without touching Jellyfin’s code at all.
For a homelab with a handful of concurrent users, the tradeoff of “session loss on pod death requires clicking Play again” is acceptable. This is no worse than the current behavior when the single pod restarts — which happens every time I drain a node.
Strategy C (full Redis externalization) was kept as an optional escalation path (“Track B”) in case Track A proved insufficient after real-world testing.
The Four-Phase Plan
With the strategy decided, I broke the project into four sequential phases:
| Phase | Name | Scope | What Changes |
|---|---|---|---|
| 1 | PostgreSQL Database Provider | Fork Jellyfin, create Npgsql EF Core provider, generate migrations | Jellyfin fork repo |
| 2 | Storage Architecture | Deploy PostgreSQL, restructure volumes, migrate data | k3s cluster manifests |
| 3 | State Externalization | Deploy Redis, disable conflicting features, operational coordination | Both repos |
| 4 | Multi-Replica Deployment | Scale to 2 replicas, sticky sessions, failover testing, monitoring | k3s cluster manifests |
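To make the phase boundaries concrete, here is roughly what Phase 2’s database backend looks like as a manifest. This is a sketch only — the image tag, names, sizes, and StorageClass below are placeholders, not the actual manifests from the cluster repo:

```yaml
# Sketch: single-instance PostgreSQL StatefulSet backing Jellyfin.
# All names and sizes are illustrative placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jellyfin-postgres
spec:
  serviceName: jellyfin-postgres
  replicas: 1
  selector:
    matchLabels:
      app: jellyfin-postgres
  template:
    metadata:
      labels:
        app: jellyfin-postgres
    spec:
      containers:
        - name: postgres
          image: postgres:17
          env:
            - name: POSTGRES_DB
              value: jellyfin
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jellyfin-postgres
                  key: password
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: longhorn
        resources:
          requests:
            storage: 10Gi
```

Note the asymmetry: PostgreSQL itself stays RWO and single-instance — that's fine, because the Jellyfin pods reach it over the network instead of mounting its volume.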
Each phase produces a working, validated state. Phase 1 builds and tests pass. Phase 2 deploys and serves users. Phase 3 adds coordination. Phase 4 flips the switch.
The dependency chain is strict: you can’t restructure storage until the database provider exists. You can’t add replicas until state conflicts are resolved. You can’t test failover until replicas are running.
Architecture: Before and After
Before (Single Pod)
After (Two Pods)
The key structural changes:
- SQLite → PostgreSQL — network-accessible, concurrent writes, row-level locking
- RWO config PVC → NFS RWX — both pods mount the same config directory
- emptyDir cache → per-pod Longhorn PVCs — each pod gets its own transcode and cache storage via StatefulSet `volumeClaimTemplates`
- Redis — shared state store for session coordination (or at minimum, graceful degradation)
- Traefik sticky sessions — cookie-based affinity routes clients to the same pod
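Two of those changes are plain Kubernetes/Traefik configuration. A sketch, assuming Traefik as the ingress controller (as in this cluster) — the annotations are Traefik’s sticky-cookie options for Services, while the Service/PVC names, ports, and sizes are placeholders:

```yaml
# Sticky sessions: Traefik's Kubernetes provider reads these
# annotations from the Service and pins each client, via cookie,
# to one backend pod. Names below are illustrative placeholders.
apiVersion: v1
kind: Service
metadata:
  name: jellyfin
  annotations:
    traefik.ingress.kubernetes.io/service.sticky.cookie: "true"
    traefik.ingress.kubernetes.io/service.sticky.cookie.name: "jellyfin-pod"
spec:
  selector:
    app: jellyfin
  ports:
    - port: 8096
      targetPort: 8096
---
# Shared /config: an RWX claim both pods can mount simultaneously.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jellyfin-config
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs
  resources:
    requests:
      storage: 5Gi
```

The cookie is what keeps a client’s transcode session glued to the pod whose FFmpeg process is producing its segments; lose the cookie (or the pod) and the client simply lands on the other replica.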
Why 2 Replicas, Not 3+
| Factor | Reasoning |
|---|---|
| Memory footprint | Each replica needs 1-4 GiB RAM. Three replicas = 12 GiB earmarked for one media server. |
| Failure domains | 2 replicas on different nodes covers single-node failure. For a 7-node homelab cluster, this is sufficient. |
| GPU availability | All nodes have Intel UHD 630 iGPUs with GPU passthrough from Proxmox — both replicas get QuickSync hardware transcoding. A third replica adds availability but not capacity. |
| PostgreSQL | Single PG instance handles Jellyfin’s read/write volume easily. Not a bottleneck until hundreds of concurrent users. |
The Multi-Model Execution Plan
I used the same multi-model planning pattern that shipped dnd-multi in a single day — the pattern I documented in Two AIs, One Codebase. The next post in this series covers how I adapted that pattern for infrastructure work across two repositories.
The short version: four AI models reviewed the plan before any code was written. Each found different gaps. Claude Opus 4.6 synthesized the final execution document. GitHub Copilot agent implemented the code via surgical GitHub Issues. Claude Sonnet 4.6 in VS Code reviewed diffs, corrected mistakes, and merged PRs.
Zero lines of code were written by hand. The planning phase is where the human time went.
Coming Up Next
Tomorrow: how I adapted the multi-model planning workflow for a project that spans a .NET fork and a Kubernetes infrastructure repo, and what each model caught that the others missed.
Want to follow along? The Jellyfin fork with the PostgreSQL provider will be open-sourced later this week — stay tuned for the link in Day 3.
Don’t have a homelab? A Jellyfin instance with PostgreSQL can run on any cloud Kubernetes provider. A $200-credit DigitalOcean account is enough to host the entire stack for months.