TL;DR
The digital signage system was pulling weather from OpenWeatherMap, calendar events from Google Calendar, and device status from MQTT – three separate API keys, three separate failure modes. Home Assistant already had all of this data. I built an HA proxy service that exposes weather, forecasts, calendar events, temperature sensors, and arbitrary entity queries through a single Flask API backed by the Home Assistant REST API. Five new endpoints replaced three external dependencies. I also added API key authentication with role-based access control, wrote 37 tests, fixed MQTT addressing after a VLAN migration, and fought through 6 CI/CD fixes to get the pipeline deploying on self-hosted ARC runners.
The Problem: Three APIs Where One Would Do
The digital signage stack runs on k3s – an Angular frontend backed by 7 Flask microservices communicating over MQTT. When I first deployed it in February, each widget that needed external data had its own API integration:
- Weather widget: OpenWeatherMap API key, direct HTTP calls from the quotes service
- Calendar widget: Google Calendar service account, OAuth2 token refresh, a separate Flask endpoint
- Indoor temperature: Direct MQTT subscription to sensor topics
This worked, but it had three problems:
- Key management overhead. Three API keys to rotate, three sets of credentials in Kubernetes secrets, three services to restart when a key changes.
- Redundant data. Home Assistant already tracks weather, calendars, and every temperature sensor in the house. The data was being fetched twice – once by HA, once by the signage services.
- No graceful degradation. If the OpenWeatherMap API returned a 429, the weather widget showed a blank screen. No fallback, no cached data, no error message.
The HA Proxy Solution
Home Assistant exposes a REST API that can query any entity state, call any service, and fetch calendar events. The HA proxy is a Flask service that translates digital signage widget requests into HA API calls:
| Endpoint | HA Source | Replaces |
|---|---|---|
| `/api/ha/weather` | `weather.forecast_home` entity | OpenWeatherMap API |
| `/api/ha/weather/forecast` | `weather.get_forecasts` service call | OpenWeatherMap forecast |
| `/api/ha/calendar` | All `calendar.*` entities | Google Calendar API |
| `/api/ha/temperatures` | `sensor.*` filtered by `device_class: temperature` | Direct MQTT |
| `/api/ha/entities` | Any entity by domain, device_class, or keyword | N/A (new capability) |
The `/api/ha/entities` endpoint is the most useful one. It is a generic query interface: pass a domain like `light`, a device class like `temperature`, or a keyword like `living_room`, and it returns matching entities with their current state. New widgets can be built against this endpoint without touching any backend code.
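A minimal sketch of how such a query endpoint can sit on top of the Home Assistant REST API (`/api/states` with a long-lived bearer token). The environment variable names and the exact response shape here are assumptions for illustration, not the proxy's actual code:

```python
import os
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical config names - the real service reads these from Kubernetes secrets
HA_URL = os.environ.get("HA_URL", "http://homeassistant.local:8123")
HA_TOKEN = os.environ.get("HA_TOKEN", "")

def fetch_states():
    """Fetch every entity state from the Home Assistant REST API."""
    resp = requests.get(
        f"{HA_URL}/api/states",
        headers={"Authorization": f"Bearer {HA_TOKEN}"},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()

@app.route("/api/ha/entities")
def entities():
    domain = request.args.get("domain")              # e.g. ?domain=light
    device_class = request.args.get("device_class")  # e.g. ?device_class=temperature
    keyword = request.args.get("keyword")            # e.g. ?keyword=living_room

    matches = []
    for state in fetch_states():
        entity_id = state["entity_id"]
        attrs = state.get("attributes", {})
        # Each filter is optional; unspecified filters match everything
        if domain and not entity_id.startswith(domain + "."):
            continue
        if device_class and attrs.get("device_class") != device_class:
            continue
        if keyword and keyword not in entity_id:
            continue
        matches.append({"entity_id": entity_id,
                        "state": state["state"],
                        "attributes": attrs})
    return jsonify(matches)
```

Because all three filters are optional and composable, `?domain=sensor&device_class=temperature` and `?keyword=living_room` both work against the same handler, which is what makes new widgets backend-free.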
Graceful Degradation
The original services crashed or returned 500s when their external API was unavailable. The HA proxy handles three failure modes:
- No HA token configured. Returns structured empty responses (`{"conditions": "unavailable"}`) instead of 503 errors. The frontend renders a "data unavailable" state instead of a blank screen.
- HA unreachable. Rate-limited error logging (one log line per 60 seconds, not per request) and cached last-known-good responses.
- Entity not found. Returns empty arrays or null values with appropriate HTTP status codes. The widget decides how to render missing data.
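The "HA unreachable" behavior - serve the last-known-good response and log at most once per interval - can be captured in a small wrapper. This is a sketch of the pattern, not the proxy's actual class; the names and the 60-second default are assumptions:

```python
import time
import logging

logger = logging.getLogger("ha_proxy")

class CachedFetcher:
    """Wraps an upstream fetch with last-known-good caching and
    rate-limited error logging (at most one log line per interval)."""

    def __init__(self, fetch_fn, fallback, log_interval=60):
        self.fetch_fn = fetch_fn        # callable that hits the HA API
        self.fallback = fallback        # structured empty response for cold-start failures
        self.log_interval = log_interval
        self.last_good = None
        self._last_log = 0.0            # monotonic timestamp of the last error log

    def get(self):
        try:
            self.last_good = self.fetch_fn()
            return self.last_good
        except Exception as exc:
            now = time.monotonic()
            if now - self._last_log >= self.log_interval:
                logger.warning("HA unreachable: %s", exc)
                self._last_log = now
            # Serve cached data if we have it, the structured fallback if not
            return self.last_good if self.last_good is not None else self.fallback
```

A weather handler built on this never returns a 500 to the widget: it degrades from live data, to stale data, to `{"conditions": "unavailable"}`, in that order.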
API Key Authentication
The digital signage API was completely open – no authentication on any endpoint. For an internal-only service behind Traefik ingress with no external exposure, this was acceptable initially. But the HA proxy endpoints expose real-time data about the house (temperatures, calendar events, device states), and the settings API lets any client reconfigure the display layout.
The auth system uses API keys with role-based access:
- Public endpoints (no key required): health checks, GET settings/layouts, device registration
- User endpoints (valid API key): GET HA data, read chores
- Admin endpoints (admin role): write settings, manage layouts, delete devices, manage users
Keys are stored hashed in the database. A require_api_key decorator validates the X-API-Key header against the users table. A require_role decorator gates admin-only operations. Fifteen auth-specific tests cover the key validation, role checks, and edge cases (expired keys, missing headers, invalid roles).
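The decorator pair can be sketched roughly like this. The in-memory `USERS` dict stands in for the real users table, and SHA-256 stands in for whatever hashing the service actually uses - both are assumptions for illustration:

```python
import hashlib
from functools import wraps
from flask import Flask, g, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for the users table: sha256(key) -> role.
# The real service stores hashed keys in the database.
USERS = {
    hashlib.sha256(b"secret-user-key").hexdigest(): "user",
    hashlib.sha256(b"secret-admin-key").hexdigest(): "admin",
}

def lookup_role(api_key):
    """Return the role for a key, or None. Only hashes are ever compared."""
    if not api_key:
        return None
    return USERS.get(hashlib.sha256(api_key.encode()).hexdigest())

def require_api_key(fn):
    @wraps(fn)
    def wrapper(*args, **kwargs):
        role = lookup_role(request.headers.get("X-API-Key"))
        if role is None:
            return jsonify({"error": "invalid or missing API key"}), 401
        g.user_role = role          # stash for downstream role checks
        return fn(*args, **kwargs)
    return wrapper

def require_role(required):
    def decorator(fn):
        @wraps(fn)
        @require_api_key            # a role check implies a valid key
        def wrapper(*args, **kwargs):
            if g.user_role != required:
                return jsonify({"error": "forbidden"}), 403
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@app.route("/api/ha/weather")
@require_api_key
def weather():
    return jsonify({"conditions": "ok"})

@app.route("/api/settings", methods=["POST"])
@require_role("admin")
def write_settings():
    return jsonify({"saved": True})
```

Stacking `require_role` on top of `require_api_key` keeps the failure modes distinct: a missing or invalid key is a 401, a valid key with the wrong role is a 403.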
CI/CD: Six Fixes to Ship One Feature
Getting this deployed on self-hosted ARC runners required six separate CI fixes:
1. `--provenance=false` on `docker build`. Without it, Docker Desktop adds attestation manifests that k3s containerd cannot resolve. This is the same fix documented in the cluster's anti-patterns.
2. `docker buildx` vs `docker build`. The `--provenance` flag only exists in `docker buildx build`. ARC runners have `docker` but not always `buildx` – reverted to plain `docker build`, which defaults to no provenance on the runner's Docker version.
3. venv for test dependencies. ARC runners share a node-level Python installation. Installing `pytest` and `pytest-cov` globally conflicts with other services. Each CI job now creates a venv.
4. `pytest-cov` and `addopts` override. The project's `pytest.ini` defines `addopts = --cov`. CI needs to override this to avoid coverage failures on partial test runs. Added `pytest-cov` to CI dependencies and let the coverage flag through.
5. Docker Hub rate limit. ARC runners pulling `nginx:alpine` during multi-stage builds hit Docker Hub's anonymous rate limit. Fixed by mirroring `nginx:alpine` to ECR and hardcoding the ECR registry in the Dockerfile.
6. ECR registry in Dockerfile. The `FROM` line needs the full ECR URI. Parameterizing it with `ARG` was considered and rejected – the registry does not change between environments, and `ARG` adds a layer cache-bust that slows builds.
MQTT IP Migration
A quiet bug from the VLAN migration surfaced: four files still referenced the Mosquitto MQTT broker at 192.168.1.203 (the old flat network) instead of 192.168.20.203 (the new Server VLAN 20 address). MQTT connections worked intermittently because the old IP was sometimes reachable via the VLAN router, but with enough latency to cause message delivery failures.
The fix was a find-and-replace across four files. The lesson was the same one from every VLAN migration: grep the entire codebase for old IPs after the migration, not just the infrastructure files.
Test Coverage
The session added 37 tests total:
- 22 unit tests for quotes, weather, and HA proxy services
- 15 authentication tests for key validation, role checks, and edge cases
The test infrastructure itself was new – pytest was not previously configured for this project. Adding conftest.py, test fixtures for Flask app contexts, and mock HA responses was as much work as writing the tests themselves.
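The shape of that fixture setup looks roughly like this - an app factory that takes its HA data source as a parameter, so tests inject canned responses instead of hitting the real API. The factory name, mock payload, and single endpoint here are hypothetical, pared down to show the pattern:

```python
# conftest.py (sketch) - shared fixtures with mocked HA responses
import pytest
from flask import Flask, jsonify

# Hypothetical canned payload mimicking HA's /api/states response shape
MOCK_HA_STATES = [
    {"entity_id": "weather.forecast_home", "state": "sunny",
     "attributes": {"temperature": 21.0}},
    {"entity_id": "sensor.office_temp", "state": "22.4",
     "attributes": {"device_class": "temperature"}},
]

def create_app(states_source):
    """Tiny stand-in for the HA proxy app factory. Taking the HA fetch
    as a callable is what makes the service testable without a live HA."""
    app = Flask(__name__)

    @app.route("/api/ha/temperatures")
    def temperatures():
        temps = [s for s in states_source()
                 if s.get("attributes", {}).get("device_class") == "temperature"]
        return jsonify(temps)

    return app

@pytest.fixture
def client():
    app = create_app(lambda: MOCK_HA_STATES)
    app.config["TESTING"] = True
    with app.test_client() as c:
        yield c

def test_temperatures_filters_by_device_class(client):
    data = client.get("/api/ha/temperatures").get_json()
    assert [s["entity_id"] for s in data] == ["sensor.office_temp"]
```

Once the fixture exists, each new endpoint test is a few lines against `client`, which is why front-loading the conftest work pays off.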
What’s Next
The HA proxy opens the door for new widgets without any backend work. A humidity widget, a light status panel, an energy usage display – all queryable via /api/ha/entities. The next cycle focuses on the Angular frontend: building these widgets against the new endpoints and replacing the legacy direct-API integrations.