
AI Agents Work Better When They Actually Know How You Operate

TL;DR AI agents fail when they don’t know what you know. I built a Slack bot that conducts structured 5-layer interviews to extract tacit knowledge (operating rhythms, decision criteria, dependencies, friction points, leverage opportunities) and generates soul.md, user.md, and heartbeat.md config files for provisioning agents. The interview surfaces ~30% more actionable context than documentation alone. Full source code below.

The Problem Nobody’s Talking About

Nate B. Jones has a video that nails the core issue with AI agents: they fail because they lack tacit knowledge. Not the stuff in your docs, but the stuff in your head. The 20-year veteran who just knows that the staging deploy takes longer on Thursdays because the batch job runs. The designer who can feel when a color palette is wrong without being able to articulate why. ...

April 16, 2026 · 11 min · zolty

Self-Hosted AI on a 24GB GPU: OpenClaw + Ollama Setup Guide for Windows

TL;DR You have a 24GB VRAM GPU. You want a private, self-hosted AI assistant that rivals ChatGPT – no subscriptions, no data leaving your machine. This guide walks you through setting up Ollama (local model runtime) and OpenClaw (AI gateway with a web UI) on Windows using Docker Desktop. But the real value here is the model recommendations. I ran 5,475 evaluations across 21 prompt variants and 6 models on real trading data. The results contradicted almost everything the community recommends. Finance-tuned models performed worse than a coin flip. Chain-of-thought reasoning models were anti-patterns. The winners were general-purpose MoE (Mixture-of-Experts) models that nobody talks about for specialized tasks. ...

April 14, 2026 · 21 min · zolty

Dream Workers: Letting an AI Agent Improve Your Cluster While You Sleep

TL;DR I built an “Ops Dream Worker”: a Kubernetes CronJob that runs at 3 AM, inspects the cluster, identifies improvements, and files GitHub issues with specific fixes. It runs entirely on local models (Mac Studio M3 Ultra), costs $0 per run, and went through 240 A/B test iterations to optimize the prompts. The anti-hallucination patterns were harder to get right than the analysis itself.

The idea

I have a k3s cluster with ~40 deployed services. I maintain it solo. There’s always something that could be better: a deployment missing resource limits, a CronJob that’s been failing silently, an ingress without SSO protection, a container image with known CVEs. These improvements pile up because I’m usually focused on building features, not auditing infrastructure. ...

April 8, 2026 · 8 min · zolty
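
One of the checks named in the Dream Workers summary above, a deployment missing resource limits, can be illustrated with a short sketch. This is not the post's code: the function name `containers_missing_limits` and the sample manifest are hypothetical, and the sketch assumes the Deployment manifest has already been loaded into a plain Python dict.

```python
from typing import Any


def containers_missing_limits(deployment: dict[str, Any]) -> list[str]:
    """Return the names of containers in a Deployment manifest that set no resource limits."""
    pod_spec = deployment.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        # "resources" may be absent, None, or an empty mapping; treat all three alike.
        limits = (container.get("resources") or {}).get("limits")
        if not limits:
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Hypothetical manifest used only for illustration.
example = {
    "kind": "Deployment",
    "metadata": {"name": "demo-api"},
    "spec": {"template": {"spec": {"containers": [
        {"name": "api", "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
        {"name": "sidecar", "resources": {}},
        {"name": "worker"},
    ]}}},
}

print(containers_missing_limits(example))
```

A check like this is deterministic, so under these assumptions a model would only need to phrase the finding rather than discover it.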

OpenClaw vs Claude Code: An Architectural Comparison

TL;DR Someone leaked the Claude Code source on GitHub. OpenClaw, the open-source AI coding agent with 346k stars, solves the same problem with a completely different architecture. I compared both codebases at the structural level. The verdict: these are independent implementations that converge on the same tool-use patterns because that is what the problem demands, not because one copied the other.

Background

In late March 2026, a repository appeared on GitHub containing what appears to be the full source code for Anthropic’s Claude Code, the terminal-based AI coding agent I wrote about switching to last month. The repo has two commits (“init” and “add readme”), 1,932 files, and weighs 43MB. ...

April 2, 2026 · 11 min · zolty

OpenClaw Multi-User: Privacy, Dual AI Backends, and Per-User Cost Tracking

TL;DR Multi-user AI chat with privacy guarantees, dual model providers (Anthropic direct API + AWS Bedrock via LiteLLM), and per-user cost tracking via Prometheus and Grafana. The admin cannot read other users’ conversations. Three family members authenticate via Google OAuth, each getting isolated chat sessions. Anthropic serves as the primary model provider with lower latency, and Bedrock via LiteLLM acts as a fallback. Per-user spend is tracked through LiteLLM’s Prometheus metrics without any surveillance of conversation content. This is a follow-up to the OpenClaw on k3s setup post. ...

March 25, 2026 · 13 min · zolty
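
The primary-plus-fallback routing and content-free spend tracking described in the multi-user summary above can be sketched in a few lines. This is an in-process stand-in, not the post's setup: the post tracks spend through LiteLLM's Prometheus metrics, whereas the `FallbackRouter` class, the `Provider` signature, and the two stub providers here are all hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical provider signature: takes a prompt, returns (reply, cost_in_usd).
Provider = Callable[[str], tuple[str, float]]


class FallbackRouter:
    """Try the primary provider first; use the fallback if the primary raises."""

    def __init__(self, primary: Provider, fallback: Provider) -> None:
        self.primary = primary
        self.fallback = fallback
        self.spend_by_user: dict[str, float] = defaultdict(float)

    def ask(self, user: str, prompt: str) -> str:
        try:
            reply, cost = self.primary(prompt)
        except Exception:
            reply, cost = self.fallback(prompt)
        # Only the cost is recorded; neither the prompt nor the reply is stored.
        self.spend_by_user[user] += cost
        return reply


def flaky_primary(prompt: str) -> tuple[str, float]:
    # Stub that simulates an outage of the primary provider.
    raise TimeoutError("primary provider unavailable")


def steady_fallback(prompt: str) -> tuple[str, float]:
    # Stub that always answers at a made-up fixed cost.
    return f"fallback reply to: {prompt}", 0.002


router = FallbackRouter(flaky_primary, steady_fallback)
print(router.ask("alice", "hello"))
print(dict(router.spend_by_user))
```

The point of the design is that the accounting path sees a user identifier and a number, nothing else, which is what lets spend be tracked without reading conversations.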

OpenClaw on k3s: Replacing Open WebUI with a Lighter AI Gateway

TL;DR I replaced Open WebUI with OpenClaw – a lighter, WebSocket-based AI assistant gateway that installs from npm, supports multiple chat channels (web, Telegram, Discord, WhatsApp), and deploys on k3s as a single Deployment with a custom Docker image. The primary model provider is Anthropic’s direct API (Claude Sonnet 4.5), with LiteLLM/Bedrock as a fallback. The biggest deployment lesson: OpenClaw binds to loopback by default, which makes it invisible to Kubernetes Services and health probes. The fix is --bind lan, which requires a gateway token for authentication. ...

March 23, 2026 · 13 min · zolty
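
The loopback lesson in the summary above comes down to which address a listener binds to: a socket bound to a loopback address only accepts connections that originate inside the same network namespace, so traffic arriving at the pod IP (Services, kubelet probes) never reaches it. A minimal sketch using Python's standard `ipaddress` module; the helper name `reachable_from_other_hosts` and the sample addresses are hypothetical, and the sketch assumes, without confirmation from the source, that `--bind lan` makes the gateway listen on a non-loopback address.

```python
import ipaddress


def reachable_from_other_hosts(bind_address: str) -> bool:
    """Return False when a listener on bind_address only accepts local connections."""
    return not ipaddress.ip_address(bind_address).is_loopback


# Loopback addresses (IPv4 and IPv6) versus a wildcard and a sample pod-style address.
for address in ("127.0.0.1", "::1", "0.0.0.0", "10.42.0.15"):
    print(address, reachable_from_other_hosts(address))
```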

Affiliate Disclosure: Some links on this site are affiliate links (Amazon Associates, DigitalOcean referral). As an Amazon Associate, I earn from qualifying purchases. This does not affect the price you pay or my editorial independence — I only recommend products and services I personally use and trust.