TL;DR

The digital signage system was pulling weather from OpenWeatherMap, calendar events from Google Calendar, and device status from MQTT – three separate API keys, three separate failure modes. Home Assistant already had all of this data. I built an HA proxy service that exposes weather, forecasts, calendar events, temperature sensors, and arbitrary entity queries through a single Flask API backed by the Home Assistant REST API. Five new endpoints replaced three external dependencies. I also added API key authentication with role-based access control, wrote 37 tests, fixed MQTT addressing after a VLAN migration, and fought through six CI/CD fixes to get the pipeline deploying on self-hosted ARC runners.

The Problem: Three APIs Where One Would Do

The digital signage stack runs on k3s – an Angular frontend backed by 7 Flask microservices communicating over MQTT. When I first deployed it in February, each widget that needed external data had its own API integration:

  • Weather widget: OpenWeatherMap API key, direct HTTP calls from the quotes service
  • Calendar widget: Google Calendar service account, OAuth2 token refresh, a separate Flask endpoint
  • Indoor temperature: Direct MQTT subscription to sensor topics

This worked, but it had three problems:

  1. Key management overhead. Three API keys to rotate, three sets of credentials in Kubernetes secrets, three services to restart when a key changes.
  2. Redundant data. Home Assistant already tracks weather, calendars, and every temperature sensor in the house. The data was being fetched twice – once by HA, once by the signage services.
  3. No graceful degradation. If the OpenWeatherMap API returned a 429, the weather widget showed a blank screen. No fallback, no cached data, no error message.

The HA Proxy Solution

Home Assistant exposes a REST API that can query any entity state, call any service, and fetch calendar events. The HA proxy is a Flask service that translates digital signage widget requests into HA API calls:

| Endpoint | HA Source | Replaces |
|---|---|---|
| /api/ha/weather | weather.forecast_home entity | OpenWeatherMap API |
| /api/ha/weather/forecast | weather.get_forecasts service call | OpenWeatherMap forecast |
| /api/ha/calendar | All calendar.* entities | Google Calendar API |
| /api/ha/temperatures | sensor.* filtered by device_class: temperature | Direct MQTT |
| /api/ha/entities | Any entity by domain, device_class, or keyword | N/A (new capability) |
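The translation itself is thin: HA exposes entity state at GET /api/states/&lt;entity_id&gt; and authenticates with a long-lived access token in a Bearer header. A minimal sketch of how the proxy could build those calls – the base URL and helper name are illustrative, not the service's actual code:

```python
import urllib.request

HA_BASE_URL = "http://homeassistant.local:8123"  # assumed HA address

def build_ha_request(path: str, token: str,
                     base_url: str = HA_BASE_URL) -> urllib.request.Request:
    """Build an authenticated GET against the Home Assistant REST API."""
    return urllib.request.Request(
        f"{base_url}{path}",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# e.g. the /api/ha/weather endpoint would issue this call and reshape
# the returned JSON state object for the weather widget
req = build_ha_request("/api/states/weather.forecast_home",
                       token="<long-lived-token>")
```

One token, issued once in HA, covers every endpoint in the table – that is what collapses three credential sets into one.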

The /api/ha/entities endpoint is the most useful one. It is a generic query interface: pass a domain like light, a device class like temperature, or a keyword like living_room, and it returns matching entities with their current state. New widgets can be built against this endpoint without touching any backend code.
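The filtering logic is simple enough to sketch. This is an illustrative version, not the service's actual implementation – it assumes the shape of HA's GET /api/states response (a list of state objects with entity_id, state, and attributes):

```python
def filter_entities(states, domain=None, device_class=None, keyword=None):
    """Filter HA state objects by domain, device_class, or entity_id keyword."""
    matches = []
    for s in states:
        entity_id = s.get("entity_id", "")
        # domain is the prefix before the dot: "light", "sensor", ...
        if domain and not entity_id.startswith(domain + "."):
            continue
        if device_class and s.get("attributes", {}).get("device_class") != device_class:
            continue
        if keyword and keyword not in entity_id:
            continue
        matches.append({"entity_id": entity_id, "state": s.get("state")})
    return matches

states = [
    {"entity_id": "sensor.living_room_temp", "state": "21.4",
     "attributes": {"device_class": "temperature"}},
    {"entity_id": "light.living_room", "state": "on", "attributes": {}},
]
filter_entities(states, device_class="temperature")
# → [{"entity_id": "sensor.living_room_temp", "state": "21.4"}]
```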

Graceful Degradation

The original services crashed or returned 500s when their external API was unavailable. The HA proxy handles three failure modes:

  1. No HA token configured. Returns structured empty responses ({"conditions": "unavailable"}) instead of 503 errors. The frontend renders a “data unavailable” state instead of a blank screen.
  2. HA unreachable. Rate-limited error logging (one log line per 60 seconds, not per request) and cached last-known-good responses.
  3. Entity not found. Returns empty arrays or null values with appropriate HTTP status codes. The widget decides how to render missing data.
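The first two modes can be sketched in a few lines. The helper name and the exact empty-response shape beyond `"conditions": "unavailable"` are assumptions for illustration:

```python
import time
import logging

logger = logging.getLogger("ha_proxy")

# structured empty response for the no-token case, so the frontend can
# render a "data unavailable" state instead of handling a 503
EMPTY_WEATHER = {"conditions": "unavailable", "temperature": None, "forecast": []}

_last_logged = {}

def log_rate_limited(key: str, message: str, interval: float = 60.0) -> bool:
    """Log at most one line per `interval` seconds per key.

    Returns True if the message was actually logged.
    """
    t = time.monotonic()
    if t - _last_logged.get(key, float("-inf")) < interval:
        return False  # suppressed: same error logged recently
    _last_logged[key] = t
    logger.error(message)
    return True
```

With HA polled once per widget refresh, the rate limiter turns a flood of identical "HA unreachable" lines into one line per minute, which keeps the real failure visible in the logs.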

API Key Authentication

The digital signage API was completely open – no authentication on any endpoint. For an internal-only service behind Traefik ingress with no external exposure, this was acceptable initially. But the HA proxy endpoints expose real-time data about the house (temperatures, calendar events, device states), and the settings API lets any client reconfigure the display layout.

The auth system uses API keys with role-based access:

  • Public endpoints (no key required): health checks, GET settings/layouts, device registration
  • User endpoints (valid API key): GET HA data, read chores
  • Admin endpoints (admin role): write settings, manage layouts, delete devices, manage users

Keys are stored hashed in the database. A require_api_key decorator validates the X-API-Key header against the users table. A require_role decorator gates admin-only operations. Fifteen auth-specific tests cover the key validation, role checks, and edge cases (expired keys, missing headers, invalid roles).
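A framework-agnostic sketch of the validation behind those decorators – the SHA-256 scheme and the dict standing in for the users table are assumptions, not the service's actual schema:

```python
import hashlib
import hmac

# stands in for the users table: stored key hash -> role
USERS = {
    hashlib.sha256(b"demo-admin-key").hexdigest(): "admin",
}

def lookup_role(presented_key: str):
    """Return the role for a presented X-API-Key value, or None if invalid."""
    presented_hash = hashlib.sha256(presented_key.encode()).hexdigest()
    for stored_hash, role in USERS.items():
        # constant-time comparison avoids leaking hash prefixes via timing
        if hmac.compare_digest(presented_hash, stored_hash):
            return role
    return None

def authorize(presented_key: str, required_role: str) -> bool:
    """Combined require_api_key + require_role check."""
    return lookup_role(presented_key) == required_role
```

In the real service this logic sits inside the decorators, which read the X-API-Key header and return 401 or 403 before the route body runs.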

CI/CD: Six Fixes to Ship One Feature

Getting this deployed on self-hosted ARC runners required six separate CI fixes:

  1. --provenance=false on docker build. Without it, Docker Desktop adds attestation manifests that k3s containerd cannot resolve. This is the same fix documented in the cluster’s anti-patterns.

  2. docker buildx vs docker build. The --provenance flag only exists in docker buildx build. ARC runners have docker but not always buildx – reverted to plain docker build which defaults to no provenance on the runner’s Docker version.

  3. venv for test dependencies. ARC runners share a node-level Python installation. Installing pytest and pytest-cov globally conflicts with other services. Each CI job now creates a venv.

  4. pytest-cov and addopts override. The project’s pytest.ini defines addopts = --cov, so every pytest invocation demands pytest-cov and full coverage. CI needs to override this to avoid coverage failures on partial test runs. Fixed by adding pytest-cov to the CI dependencies so the --cov flag resolves, and overriding addopts when only a subset of tests runs.

  5. Docker Hub rate limit. ARC runners pulling nginx:alpine during multi-stage builds hit Docker Hub’s anonymous rate limit. Fixed by mirroring nginx:alpine to ECR and hardcoding the ECR registry in the Dockerfile.

  6. ECR registry in Dockerfile. The FROM line needs the full ECR URI. Parameterizing it with ARG was considered and rejected – the registry does not change between environments, and ARG adds a layer cache-bust that slows builds.
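Fixes 1–4 condense into a couple of workflow steps. This is a hedged sketch assuming GitHub Actions syntax on the ARC runners; step names, paths, and the image tag are placeholders:

```yaml
# illustrative ARC runner steps; names and paths are placeholders
- name: Run tests in an isolated venv (fixes 3 and 4)
  run: |
    python3 -m venv .venv
    . .venv/bin/activate
    pip install -r requirements.txt pytest pytest-cov
    # override pytest.ini's addopts for a partial run when needed:
    pytest tests/ -o addopts=""

- name: Build image without provenance attestations (fixes 1 and 2)
  run: |
    # plain `docker build` on the runner's Docker version emits no
    # provenance manifests, which k3s containerd cannot resolve
    docker build -t signage/ha-proxy:latest .
```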

MQTT IP Migration

A quiet bug from the VLAN migration surfaced: four files still referenced the Mosquitto MQTT broker at 192.168.1.203 (the old flat network) instead of 192.168.20.203 (the new Server VLAN 20 address). MQTT connections worked intermittently because the old IP was sometimes reachable via the VLAN router, but with enough latency to cause message delivery failures.

The fix was a find-and-replace across four files. The lesson was the same one from every VLAN migration: grep the entire codebase for old IPs after the migration, not just the infrastructure files.
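That sweep is one command. The pattern matches the old flat-network prefix from this migration; run it from the repository root:

```shell
# sweep the whole repo for the old flat-network prefix after a VLAN
# migration -- not just the infrastructure files
grep -rn '192\.168\.1\.' . \
  && echo "stale IPs found -- fix before deploy" \
  || echo "clean"
```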

Test Coverage

The session added 37 tests total:

  • 22 unit tests for quotes, weather, and HA proxy services
  • 15 authentication tests for key validation, role checks, and edge cases

The test infrastructure itself was new – pytest was not previously configured for this project. Adding conftest.py, test fixtures for Flask app contexts, and mock HA responses was as much work as writing the tests themselves.
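The mock HA responses follow the shape of HA's GET /api/states objects. A sketch of the kind of builder the fixtures wrap – entity ids and attribute names here are illustrative, and the real conftest.py additionally sets up Flask app contexts around these:

```python
def mock_ha_state(entity_id: str, state: str, **attributes) -> dict:
    """Shape of a single state object from HA's GET /api/states."""
    return {
        "entity_id": entity_id,
        "state": state,
        "attributes": attributes,
        "last_changed": "2025-01-01T00:00:00+00:00",
        "last_updated": "2025-01-01T00:00:00+00:00",
    }

def mock_ha_states() -> list:
    """Canned HA payload the proxy tests run against."""
    return [
        mock_ha_state("weather.forecast_home", "sunny", temperature=21.5),
        mock_ha_state("sensor.office_temp", "20.1",
                      device_class="temperature"),
    ]
```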

What’s Next

The HA proxy opens the door for new widgets without any backend work. A humidity widget, a light status panel, an energy usage display – all queryable via /api/ha/entities. The next cycle focuses on the Angular frontend: building these widgets against the new endpoints and replacing the legacy direct-API integrations.