TL;DR

A homelab feels free until you read the meter. After a year of running seven k3s nodes plus a pair of Mac Studios under whatever workload I felt like throwing at them, I sat down with a Kill-a-Watt and worked out what the cluster actually costs to keep on. Idle is genuinely cheap. Sustained LLM inference is not. The honest break-even against cloud inference is workload-shaped, and for my workloads, on-prem wins — but only because I run them often enough to amortize the wattage. The numbers below are mine; substitute your electricity rate to get yours.

Why measure at all

Every “I built a homelab” post I’ve read skips the electric bill, and most of them skip the cost of cooling the room the gear lives in. Both are real. Both are also small enough that you can ignore them right up until you can’t — adding a second Mac Studio to the cluster moved my draw enough that I noticed the difference in the room temperature before I noticed it on the bill.

So I instrumented. The goal wasn’t a perfect measurement; it was a number I could plug into the on-prem-vs-cloud decision the next time I was about to buy hardware on instinct.

Methodology

  • Instrument: a Kill-a-Watt P4400 on each circuit, one at a time. Not precision lab gear; perfectly fine for this.
  • Two states: idle meaning the box has booted, all services are up, no jobs are running; and load meaning a sustained 30-minute workload representative of how that machine actually gets used.
  • Sampling: I sampled each box every 10 minutes for a full week to get an average that reflects real usage rather than a snapshot.
  • Cross-check: my utility provides 15-minute interval data via their portal. Summing per-circuit Kill-a-Watt readings matched the meter within ~5%, which is good enough.

The Kill-a-Watt P4400 is the cheapest tool that’s been worth its money in the lab: about $25, and it answers a question that otherwise costs you real dollars every month.

The k3s fleet — 7 × ThinkCentre M920q

The same ThinkCentre M920q units I covered in the cluster genesis post, measured per node:

State                                               Per-node draw
Idle (k3s up, no pods working)                      11–14W
Steady (k3s + ~30 pods doing housekeeping)          22–30W
Spiky (CI runners or image pulls hitting the box)   55–70W

The long-run average across seven nodes lands around 210W for the whole k3s fleet — about 30W per node, or under $3 per node per month in electricity at $0.12/kWh, which is the most pleasant surprise in this entire exercise.

The Mac Studio inference pair

Two Mac Studio M3 Ultras — the same pair I covered in the trillion-param MoE post and the Mac Studio observability post:

State                                           Per-machine draw
Idle                                            ~28W
Light inference (8B–14B param model, batch 1)   60–90W
Heavy inference (1T-MoE sharded across both)    280–320W

Mac Studios are spiky. The peaks are large; the duty cycle is small. My actual day-to-day average is closer to ~80W per Studio because I’m not running LLMs 24/7 — even though the observability dashboards tell me I should be.

The supporting cast

  • Ubiquiti USW Aggregation: ~22W
  • Two smaller PoE switches downstream: ~14W combined
  • Synology NAS: ~32W average, ~52W during scrubs
  • Router, ProtonVPN gateway host, Pi-hole, misc. small bits: ~25W aggregate

Nothing in this list moves the needle on its own. Together they’re another ~93W I’d otherwise forget about.

What the meter actually says

Adding it up:

Component                 Average draw
7× ThinkCentre M920q      ~210W
2× Mac Studio M3 Ultra    ~160W (spiky avg)
Switches                  ~36W
NAS                       ~32W
Misc                      ~25W
Total                     ~460W

0.46 kW × 24 × 30 ≈ 331 kWh/month continuous.

I’m not posting my electricity rate — figure out your own. Here’s the math at a few common ones:

Your rate ($/kWh)    Monthly cost
0.12                 ~$40
0.16                 ~$53
0.20                 ~$66
0.30                 ~$99
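
The table is just the continuous-draw arithmetic applied at a few rates — a minimal sketch using the same 30-day-month convention as the rest of this post:

```python
# Monthly energy and cost from a continuous average draw.
# 460W is the measured fleet average; the rates mirror the table above.

HOURS_PER_MONTH = 24 * 30  # this post's 30-day-month convention

def monthly_kwh(avg_watts: float) -> float:
    """Continuous draw in watts -> kWh over a 30-day month."""
    return avg_watts / 1000 * HOURS_PER_MONTH

def monthly_cost(avg_watts: float, rate_per_kwh: float) -> float:
    """Monthly electricity cost in dollars at a given $/kWh rate."""
    return monthly_kwh(avg_watts) * rate_per_kwh

kwh = monthly_kwh(460)  # ≈ 331 kWh/month
for rate in (0.12, 0.16, 0.20, 0.30):
    print(f"${monthly_cost(460, rate):.0f}/mo at ${rate}/kWh")
```

Swap in your own rate and average draw; everything downstream of the meter reading is just multiplication.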

For most readers that’s a meaningful but not insane number. It is, however, a number that didn’t exist on the spreadsheet when I was deciding whether to buy the second Mac Studio.

BTU math — the second bill

Watts dissipate as heat. The conversion is:

1W ≈ 3.412 BTU/hr

So my average 460W = ~1,570 BTU/hr the room has to shed continuously.

During an inference burst — both Studios cranking, adding ~500W on top of baseline — that climbs to roughly 3,280 BTU/hr, about two-thirds of a 5,000-BTU window AC running flat-out.

In a cold climate the cluster is free heat for half the year — genuinely useful, you can subtract it from your heating bill. In a hot climate it’s a second electricity bill: your AC has to remove every watt the cluster generates, and at a typical coefficient of performance (COP) of 2.5–3.5, removing each watt of heat costs roughly an extra 0.3–0.4W of AC electricity. So the true summer operating cost runs about 1.3–1.4× what the cluster meter shows. None of the homelab posts I read mentioned this. It’s the most significant hidden cost in summer.
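
The heat-load conversions in code — the COP value here is my assumption for a typical residential air conditioner, not a measured number:

```python
BTU_PER_WATT_HOUR = 3.412  # standard conversion: 1W continuous ≈ 3.412 BTU/hr

def heat_load_btu_hr(watts: float) -> float:
    """Continuous electrical draw -> heat the room must shed, in BTU/hr."""
    return watts * BTU_PER_WATT_HOUR

def summer_multiplier(cop: float) -> float:
    """Removing 1W of heat costs ~1/COP watts of AC electricity,
    so the all-in summer cost is (1 + 1/COP) x the cluster's own draw."""
    return 1 + 1 / cop

print(heat_load_btu_hr(460))   # baseline: ~1,570 BTU/hr
print(heat_load_btu_hr(960))   # inference burst (baseline + ~500W): ~3,280 BTU/hr
print(summer_multiplier(3.0))  # ~1.33x with an assumed COP-3 air conditioner
```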

The ROI question — own vs rent

Two honest ways to ask this, and they give different answers.

Option A: replicate the homelab with cloud equivalents

This is a silly comparison because the homelab isn’t a single workload — it’s dozens of always-on services. But for the curious: seven always-on EC2 t3.medium instances are ~$210/mo. Add EBS, transfer, NAT gateway, and you’re well past $500/mo before you’ve started thinking about inference. So on the “replace everything with managed cloud” axis, the homelab pays for itself comfortably.
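
For the curious, the compute-only arithmetic behind that figure — the hourly rate is the us-east-1 Linux on-demand price as I know it, so verify current pricing before leaning on it:

```python
T3_MEDIUM_HOURLY = 0.0416  # us-east-1 Linux on-demand $/hr; check current pricing
HOURS_PER_MONTH = 730      # AWS's usual monthly-hours convention

# Seven always-on instances, compute only — before EBS, transfer, NAT.
compute_only = 7 * T3_MEDIUM_HOURLY * HOURS_PER_MONTH
print(f"${compute_only:.0f}/mo before storage, transfer, and NAT")
```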

This isn’t the comparison most people are actually making, though.

Option B: just look at the inference workload

This is the honest comparison. The Mac Studios are the expensive part of the stack and they exist to run LLMs locally instead of paying per token.

If I run sustained inference 4 hours a day at ~600W average:

  • 0.6 kW × 4 hr × 30 = 72 kWh/mo
  • At $0.16/kWh = ~$11.50/month in electricity for the inference workload itself
  • Equivalent volume via API: even a modest 100M input + 20M output tokens on a frontier model is in the $300–600/month range
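
The bullets above, sketched in code — the per-million-token rates are illustrative assumptions on my part, not any provider’s actual price list:

```python
def inference_kwh(avg_watts: float, hours_per_day: float, days: int = 30) -> float:
    """kWh consumed by the inference workload over a month."""
    return avg_watts / 1000 * hours_per_day * days

# Electricity side: 4 hours/day of sustained inference at ~600W average.
kwh = inference_kwh(600, 4)   # 72 kWh/month
electricity = kwh * 0.16      # $11.52/month at $0.16/kWh

# API side: hypothetical frontier-model rates (assumed, not quoted):
# $3 per 1M input tokens, $15 per 1M output tokens.
api = 100 * 3.00 + 20 * 15.00  # 100M in + 20M out -> $600/month

print(f"local electricity: ${electricity:.2f}/mo vs API: ${api:.0f}/mo")
```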

That math is too favorable to local. The honest version: the Mac Studio capex was substantial, and most of my workloads happily fit on Sonnet via API at well under $100/mo. The local rig wins when the workload is large, privacy-sensitive, or experiment-shaped — not because it’s cheaper per inference, but because it’s flat-cost per inference and I can experiment without watching a meter.

Don’t have a homelab? This same math runs in the opposite direction for most people. A few hundred bucks of DigitalOcean credits will cover a year of small experiments and you don’t have to buy a Mac Studio or warm a room you weren’t planning to heat.

What I’d do differently

  • Measure before buying. I bought the Mac Studio pair on a “this will be useful” instinct. The electricity math came after. Both are fine; the order was backwards.
  • Right-size the fleet. Seven nodes is more than I need — I built for the next bottleneck and inherited a 200W floor I don’t want. Five would be the right number if I were redoing it.
  • Model BTUs from day one. In summer the cluster’s effect on the room AC is non-trivial. I didn’t account for it and got surprised the first hot week.

Lessons

  • Idle is much cheaper than people assume. A ThinkCentre M920q at idle costs on the order of a dollar a month in electricity at typical rates. The cost is in active workloads, not in keeping a node powered on.
  • Sustained inference is the line item. Everything else combined is rounding error compared to two Mac Studios running an LLM. Plan capex around the workloads that will burn watts continuously, not the ones that sit at 5% CPU.
  • The cloud comparison is workload-shaped. “Is homelab cheaper than cloud” has no answer until you specify for what workload, at what utilization. Idle-heavy fleets win on-prem. Burst-only workloads almost always win in the cloud. Sustained inference depends on token volume.
  • The room is part of the system. Your AC bill is your second electricity bill in summer. Model it.

What’s next

Wiring the Kill-a-Watt readings into Prometheus via a USB-serial scraper is the obvious next step — once you know the numbers exist, you want them on a Grafana dashboard alongside everything else. The hardware exists; the laziness has been the bottleneck.
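
A minimal sketch of that exporter, with the big assumptions flagged up front: the stock P4400 has no data port, so this presumes a meter or smart plug that emits per-circuit readings over USB-serial as lines like `CIRCUIT_A 213.4` — the line format, device path, port number, and metric name are all mine, not a real protocol.

```python
import re

# Assumed wire format: "NAME WATTS" per line, e.g. "CIRCUIT_A 213.4".
LINE = re.compile(r"^(\w+)\s+([\d.]+)\s*$")

def parse_line(line: str):
    """Parse one reading; return (circuit, watts) or None on junk lines."""
    m = LINE.match(line.strip())
    return (m.group(1), float(m.group(2))) if m else None

def run(device: str = "/dev/ttyUSB0", port: int = 9110):
    # Third-party deps imported lazily so the parser stays testable without them.
    import serial                                    # pyserial
    from prometheus_client import Gauge, start_http_server

    draw = Gauge("homelab_circuit_watts",
                 "Instantaneous draw per circuit", ["circuit"])
    start_http_server(port)                          # exposes /metrics for Prometheus
    with serial.Serial(device, 9600, timeout=10) as tty:
        while True:
            reading = parse_line(tty.readline().decode(errors="ignore"))
            if reading:
                circuit, watts = reading
                draw.labels(circuit=circuit).set(watts)
```

Point a Prometheus scrape job at port 9110 and the wattage lands next to the rest of the cluster metrics in Grafana.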