TL;DR
Jellyfin is a fantastic open-source media server. It is also, architecturally, a single-process application that assumes it’s the only instance running. SQLite as the database. Eleven ConcurrentDictionary caches holding sessions, users, devices, and task queues in memory. A file-based config directory that gets written to at runtime. None of this survives a second pod.
This is the first post in a seven-part series documenting how I converted Jellyfin into a highly available, multi-replica deployment on my home k3s cluster. The project spans two repositories, four phases, ~20 GitHub Issues executed by AI agents, and a live failover demo where I killed a pod and the service continued with zero downtime — users on the surviving replica never saw an interruption.
This post covers why the problem is harder than “just add replicas” and the strategy I chose to solve it.
The Problem
My Jellyfin deployment serves a household of users across TVs, phones, tablets, and laptops. When I need to drain a node for maintenance, or a pod crashes, or Longhorn decides to rebuild a volume — Jellyfin goes down hard. Every connected client shows “unable to connect.” Playback stops. The Android TV app spins. Someone yells from the living room.
This is the standard experience for every homelab media server. Plex, Emby, Jellyfin — they’re all designed as a single process running on a single machine. The Kubernetes community wraps them in Deployments with replicas: 1 and Recreate strategy and calls it a day.
I wanted to do better.
Why “Just Add Replicas” Doesn’t Work
If you set replicas: 2 on a vanilla Jellyfin Deployment, here’s what happens:
- **SQLite locks up.** Two processes cannot write to the same SQLite database concurrently. The second pod either gets `SQLITE_BUSY` errors or silently corrupts the database file.
- **Sessions split.** User sessions live in a `ConcurrentDictionary<string, SessionInfo>` in `SessionManager.cs`. Pod A knows you're logged in. Pod B doesn't. Every other request returns 401.
- **Config PVC is RWO.** The Longhorn PVC holding `/config` is `ReadWriteOnce`. Only one pod can mount it. The second pod stays in `Pending` forever.
- **Transcodes conflict.** FFmpeg writes transcode segments to disk. Two pods writing to the same directory produce garbled HLS playlists.
- **Scheduled tasks run twice.** Library scans, image extraction, plugin updates — everything in `TaskManager` fires on both pods simultaneously, doubling the work and potentially corrupting metadata.
- **QuickConnect breaks.** The pairing code flow stores pending requests in memory. You generate a code on Pod A, try to authorize it on Pod B — it doesn't exist.
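The SQLite failure mode is easy to reproduce without Jellyfin at all. A minimal sketch in plain Python (nothing below is Jellyfin code — just two connections to the same file standing in for two pods):

```python
import os
import sqlite3
import tempfile

# One database file, two connections — stand-ins for two pods.
path = os.path.join(tempfile.mkdtemp(), "library.db")

writer_a = sqlite3.connect(path)
writer_a.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
writer_a.commit()

# "Pod A" opens a write transaction and holds the write lock.
writer_a.execute("BEGIN IMMEDIATE")
writer_a.execute("INSERT INTO items (name) VALUES ('movie')")

# "Pod B" tries to write. With a short busy timeout it fails fast
# instead of spinning; Jellyfin would see SQLITE_BUSY here.
writer_b = sqlite3.connect(path, timeout=0.1)
try:
    writer_b.execute("INSERT INTO items (name) VALUES ('show')")
    outcome = "ok"
except sqlite3.OperationalError as e:
    outcome = str(e)

writer_a.commit()
print(outcome)  # "database is locked"
```

SQLite's busy handler retries for the timeout and then gives up — and that's the *graceful* case. The corruption case comes from mixing this with network filesystems or multiple hosts, where SQLite's file locking can't be trusted at all.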
This isn’t a deployment problem. It’s an architecture problem.
Prior Art: What’s Been Tried Before
I’m not the first person to attempt this. The most impressive prior work is pseudopseudonym’s highly available Jellyfin setup, documented in a Reddit post on r/selfhosted that got 207 upvotes and generated a genuinely useful discussion. Their setup runs Jellyfin across 3 dedicated Intel N100 hosts on a Proxmox cluster backed by Ceph (for config/database storage) and CubeFS (for petabyte-scale media storage), with Rook managing Ceph PVCs inside Kubernetes.
It works — there’s a video demo of Jellyfin surviving a full node power-off. But as pseudopseudonym acknowledged in the Reddit comments: it’s failover, not true HA. Only one Jellyfin instance runs at a time. When the active pod dies, Rook releases the Ceph RBD volume and a new pod claims it on another node. The bottleneck is the PVC release and pod reschedule cycle — even with aggressively tuned kubelet timers (node-status-update-frequency=4s, pod-eviction-timeout=30s), failover takes ~2.5 minutes. During that window, Jellyfin is completely down.
The fundamental constraint is SQLite. As long as Jellyfin uses SQLite, only one process can access the database at a time. The Ceph volume must be single-mount (RWO). Two pods means two SQLite writers, which means corruption.
A Hacker News discussion about SQLite concurrency captured the community debate perfectly. One commenter (a distributed systems architect) argued that Jellyfin’s single-instance assumptions make it harder to operate even on a single node, and that targeting PostgreSQL would be “strictly simpler from the perspective of the application.” Another pointed out that Jellyfin 10.11’s EF Core refactor — moving from raw Microsoft.Data.Sqlite calls to Entity Framework — finally created the foundation for a pluggable database layer.
That EF Core refactor is what makes this project possible. Without it, adding PostgreSQL support would have required rewriting hundreds of raw SQL queries. With EF Core, it’s a provider swap.
How This Project Differs
| Aspect | pseudopseudonym’s Approach | This Project |
|---|---|---|
| Architecture | Passive failover (1 pod at a time) | Active-active (2 pods, both serving traffic) |
| Database | SQLite on Ceph RBD (RWO) | PostgreSQL StatefulSet (network-accessible) |
| Failover time | ~2.5 minutes (PVC release + reschedule) | Zero downtime (surviving pod serves continuously; displaced clients reconnect in ~12s) |
| Storage stack | Ceph + CubeFS + Rook | NFS + Longhorn (much simpler) |
| State coordination | None (single instance) | Redis leader election + sticky sessions |
| Hardware scale | 864 EPYC cores, 3.75 TiB RAM, 64 K8s workers | 7-node k3s cluster, consumer hardware |
| Custom tooling | rook-push (Rook PVC taint automation) | IJellyfinDatabaseProvider (pluggable DB layer) |
Their approach is elegant for its constraints: given that SQLite is non-negotiable, shared block storage with single-mount failover is the best you can do. This project removes the constraint. By replacing SQLite with PostgreSQL and externalizing state coordination to Redis, we can run multiple pods simultaneously — something that’s architecturally impossible with SQLite.
The tradeoff: they didn’t need to fork Jellyfin. We do.
The In-Memory State Inventory
Before designing any solution, I needed to understand exactly what state Jellyfin holds in memory. I went through the codebase (Jellyfin 10.12.0, .NET 10) and cataloged every ConcurrentDictionary and stateful singleton:
| Manager | State | Data Structure | HA Impact |
|---|---|---|---|
| SessionManager | Active user sessions | `ConcurrentDictionary<string, SessionInfo>` | Critical — every API request checks session |
| SessionManager | Live stream sessions | `ConcurrentDictionary<string, ConcurrentDictionary<string, string>>` | Critical — active playback tracking |
| UserManager | All users (cached) | `ConcurrentDictionary<Guid, User>` | High — auth on every request |
| DeviceManager | Client capabilities | `ConcurrentDictionary<string, ClientCapabilities>` | Medium — client feature negotiation |
| QuickConnectManager | Pending pairing requests | `ConcurrentDictionary<string, QuickConnectResult>` | High — pairing fails across instances |
| QuickConnectManager | Authorized secrets | `ConcurrentDictionary<string, (DateTime, AuthenticationResult)>` | High |
| SyncPlayManager | Active groups | 3 interconnected `ConcurrentDictionary` instances | Deferred — too complex, low usage |
| TaskManager | Task queue | `ConcurrentQueue<Tuple<Type, TaskOptions>>` with lock | High — duplicate task execution |
| ProviderManager | Metadata refreshes | `ConcurrentDictionary<Guid, double>` + PriorityQueue | Medium — progress tracking |
| TranscodeManager | Active FFmpeg jobs | `List<TranscodingJob>` with lock + AsyncKeyedLocker | Deferred — solved by sticky sessions |
| ChannelManager | Channel media cache | `IMemoryCache` | Low — regenerable |
That’s eleven pieces of critical state held purely in memory, spread across nine manager classes. Two of those managers — SessionManager and UserManager — are consulted on every single API request.
Three Strategies
I evaluated three approaches:
Strategy A: Active-Passive
Run two pods, but only one serves traffic. The other is warm standby. On failure, Traefik routes to the standby.
Pros: Simple. No state sharing needed.
Cons: Wastes resources. The standby pod does nothing most of the time. Failover requires the standby to warm its caches (user list, session state) from scratch — slow.
Strategy B: Active-Active with Sticky Sessions
Run two pods, both serving traffic. Traefik uses cookie-based affinity to route returning clients to the same pod. State stays local per pod. On failure, clients get routed to the surviving pod and re-authenticate.
Pros: Both pods serve traffic. Simple sticky session config. Both pods have GPU access for hardware transcoding.
Cons: Session loss on pod death. Some features (SyncPlay, QuickConnect) only work on the pod that initiated them.
Strategy C: True Multi-Master
Externalize all state to Redis. Any pod can serve any request. Full session portability.
Pros: Seamless failover. No client-visible impact on pod death.
Cons: Requires modifying 6+ core Jellyfin classes (SessionManager, UserManager, DeviceManager, QuickConnectManager, TaskManager). Fork maintenance burden compounds monthly when upstream releases. SyncPlayManager needs real-time sub-millisecond coordination that’s impractical to externalize.
The Decision
Strategy B — Active-Active with sticky sessions. Here’s why:
FFmpeg transcoding is an OS process. It writes segments to local disk. You cannot distribute a single transcode job across pods. Sticky sessions ensure the client that started a transcode stays on the pod running it. This solves transcoding HA without touching Jellyfin’s code at all.
For a homelab with a handful of concurrent users, the tradeoff of “session loss on pod death requires clicking Play again” is acceptable. This is no worse than the current behavior when the single pod restarts — which happens every time I drain a node.
Strategy C (full Redis externalization) was kept as an optional escalation path (“Track B”) in case Track A proved insufficient after real-world testing.
The Four-Phase Plan
With the strategy decided, I broke the project into four sequential phases:
| Phase | Name | Scope | What Changes |
|---|---|---|---|
| 1 | PostgreSQL Database Provider | Fork Jellyfin, create Npgsql EF Core provider, generate migrations | Jellyfin fork repo |
| 2 | Storage Architecture | Deploy PostgreSQL, restructure volumes, migrate data | k3s cluster manifests |
| 3 | State Externalization | Deploy Redis, disable conflicting features, operational coordination | Both repos |
| 4 | Multi-Replica Deployment | Scale to 2 replicas, sticky sessions, failover testing, monitoring | k3s cluster manifests |
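To make the phase boundaries concrete, here is roughly what Phase 2’s database backend looks like as a manifest. This is a sketch only — the image tag, names, sizes, and StorageClass below are placeholders, not the actual manifests from the cluster repo:

```yaml
# Sketch: single-instance PostgreSQL StatefulSet backing Jellyfin.
# All names and sizes are illustrative placeholders.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jellyfin-postgres
spec:
  serviceName: jellyfin-postgres
  replicas: 1
  selector:
    matchLabels:
      app: jellyfin-postgres
  template:
    metadata:
      labels:
        app: jellyfin-postgres
    spec:
      containers:
        - name: postgres
          image: postgres:17
          env:
            - name: POSTGRES_DB
              value: jellyfin
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: jellyfin-postgres
                  key: password
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: longhorn
        resources:
          requests:
            storage: 10Gi
```

Note the asymmetry: PostgreSQL itself stays RWO and single-instance — that's fine, because the Jellyfin pods reach it over the network instead of mounting its volume.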
Each phase produces a working, validated state. Phase 1 builds and tests pass. Phase 2 deploys and serves users. Phase 3 adds coordination. Phase 4 flips the switch.
The dependency chain is strict: you can’t restructure storage until the database provider exists. You can’t add replicas until state conflicts are resolved. You can’t test failover until replicas are running.
Architecture: Before and After
Before (Single Pod)
After (Two Pods)
The key structural changes:
- SQLite → PostgreSQL — network-accessible, concurrent writes, row-level locking
- RWO config PVC → NFS RWX — both pods mount the same config directory
- emptyDir cache → per-pod Longhorn PVCs — each pod gets its own transcode and cache storage via StatefulSet `volumeClaimTemplates`
- Redis — shared state store for session coordination (or at minimum, graceful degradation)
- Traefik sticky sessions — cookie-based affinity routes clients to the same pod
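Two of those changes are plain Kubernetes/Traefik configuration. A sketch, assuming Traefik as the ingress controller (as in this cluster) — the annotations are Traefik’s sticky-cookie options for Services, while the Service/PVC names, ports, and sizes are placeholders:

```yaml
# Sticky sessions: Traefik's Kubernetes provider reads these
# annotations from the Service and pins each client, via cookie,
# to one backend pod. Names below are illustrative placeholders.
apiVersion: v1
kind: Service
metadata:
  name: jellyfin
  annotations:
    traefik.ingress.kubernetes.io/service.sticky.cookie: "true"
    traefik.ingress.kubernetes.io/service.sticky.cookie.name: "jellyfin-pod"
spec:
  selector:
    app: jellyfin
  ports:
    - port: 8096
      targetPort: 8096
---
# Shared /config: an RWX claim both pods can mount simultaneously.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jellyfin-config
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs
  resources:
    requests:
      storage: 5Gi
```

The cookie is what keeps a client’s transcode session glued to the pod whose FFmpeg process is producing its segments; lose the cookie (or the pod) and the client simply lands on the other replica.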
Why 2 Replicas, Not 3+
| Factor | Reasoning |
|---|---|
| Memory footprint | Each replica needs 1-4 GiB RAM. Three replicas = 12 GiB earmarked for one media server. |
| Failure domains | 2 replicas on different nodes covers single-node failure. For a 7-node homelab cluster, this is sufficient. |
| GPU availability | All nodes have Intel UHD 630 iGPUs with GPU passthrough from Proxmox — both replicas get QuickSync hardware transcoding. A third replica adds availability but not capacity. |
| PostgreSQL | Single PG instance handles Jellyfin’s read/write volume easily. Not a bottleneck until hundreds of concurrent users. |
The Multi-Model Execution Plan
I used the same multi-model planning pattern that shipped dnd-multi in a single day — the pattern I documented in Two AIs, One Codebase. The next post in this series covers how I adapted that pattern for infrastructure work across two repositories.
The short version: four AI models reviewed the plan before any code was written. Each found different gaps. Claude Opus 4.6 synthesized the final execution document. GitHub Copilot agent implemented the code via surgical GitHub Issues. Claude Sonnet 4.6 in VS Code reviewed diffs, corrected mistakes, and merged PRs.
Zero lines of code were written by hand. The planning phase is where the human time went.
Coming Up Next
Tomorrow: how I adapted the multi-model planning workflow for a project that spans a .NET fork and a Kubernetes infrastructure repo, and what each model caught that the others missed.
Want to follow along? The Jellyfin fork with the PostgreSQL provider will be open-sourced later this week — stay tuned for the link in Day 3.
Don’t have a homelab? A Jellyfin instance with PostgreSQL can run on any cloud Kubernetes provider. A $200-credit DigitalOcean account is enough to host the entire stack for months.