AI-powered alert analysis

Building an AI-Powered Alert System with AWS Bedrock

TL;DR Today I deployed two significant additions to the cluster: an AI-powered Alert Responder that uses AWS Bedrock (Amazon Nova Micro) to analyze Prometheus alerts and post remediation suggestions to Slack, and a multi-user dev workspace with per-user environments. I also hardened the cluster by constraining all workloads to the correct architecture nodes and fixing arm64 scheduling issues. The Alert Responder Running 13+ applications on a homelab cluster means alerts fire regularly. Most are straightforward — high memory, restart loops, certificate expiry warnings — but analyzing each one, determining root cause, and knowing the right remediation command gets tedious, especially at 2 AM. ...

February 14, 2026 · 5 min · zolty