Projects

The Hugo, S3, CloudFront, and AI drafting pipeline behind this blog

How this blog is built: Hugo, S3, CloudFront, and an AI drafting pipeline

TL;DR This site runs on deliberately boring infrastructure: Hugo generates static HTML with the PaperMod theme. Terraform manages AWS (S3, CloudFront, Route53, ACM). GitHub Actions and self-hosted k3s runners deploy on every push to main. An AI pipeline (Bedrock + a Python script) drafts articles into Hugo page bundles and opens PRs for review. There’s no dynamic backend, no database, no server to maintain. The AWS bill is ~$30/month. This post is a tour of the machine that prints the other posts. ...

Parallel agents sweeping repos for improvements under a token budget

Token-budgeted self-improvement: pointing parallel agents at my own repos

TL;DR I have $X in monthly Claude tokens I don’t always use. Instead of letting the unused credit evaporate, I built a parallel agent sweep that fans out autonomous scouts to scan for dependency upgrades, CVEs, CI waste, and quick wins across my repos. Each discovery agent returns a scored candidate list. The orchestrator triages and ranks them, then spins up isolated worktree agents to implement the safe ones — all under a hard token cap and with human gates between phases. The output is a pile of merge requests, not silent commits. Noise is real and review burden is the limiting factor, but when it lands right, an hour of agent work + human review beats a weekend of manual maintenance. ...
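
To make the "rank, then implement the safe ones under a hard token cap" step concrete, here is a minimal sketch. It is not the orchestrator from the post: the `Candidate` fields, the `plan_sweep` name, and the greedy highest-score-first policy are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One finding from a discovery agent (all fields are hypothetical)."""
    repo: str
    summary: str
    score: float      # higher means more worthwhile
    est_tokens: int   # estimated token cost to implement
    safe: bool        # True if triage judged it low-risk


def plan_sweep(candidates, token_cap):
    """Greedily pick safe candidates by score without exceeding token_cap.

    Returns (chosen, spent). Unsafe candidates are never chosen; they would
    stay in the human-review pile instead.
    """
    chosen, spent = [], 0
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if not cand.safe:
            continue
        if spent + cand.est_tokens > token_cap:
            continue  # does not fit; a cheaper, lower-ranked item still might
        chosen.append(cand)
        spent += cand.est_tokens
    return chosen, spent
```

The cap is enforced before any implementation agent starts, so the sweep cannot overspend by launching work it cannot afford; a real orchestrator would also need to account for actual versus estimated usage.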

A crowded Ultima Online street where every NPC has something to say

The peasant has friends now: rumors, routines, and a 3,200-strong crowd

TL;DR Last time I wrote about giving my Ultima Online shard’s NPCs a voice, a memory, and a small autonomous life. That post ended with “the peasant talks back now.” In the eight days since, the project grew six new systems: NPCs keep daily routines anchored to real places, every town runs a rumor board that traveling NPCs physically carry between cities, townsfolk gossip about players (your katana, your karma, your reputation), the GM avatar got actual powers governed by a genie rule, villagers hand out delivery quests, and a population director keeps every city stocked with 200 ambient “denizens” who hail you in the street. That’s ~3,200 new NPCs and maybe a dozen new LLM call sites, still running entirely on a local gemma-class model — the trick is that the model never gained a single new permission. Every new capability is deterministic code; the LLM still only ever produces words and picks verbs off allowlists. Also: I found out my RAG pipeline had been silently dead for days, and the lesson there is worth the price of admission. ...

An Ultima Online town NPC with a speech bubble driven by a local language model

When the peasant talks back: LLM NPCs in Ultima Online

TL;DR I run an Ultima Online shard on my homelab where the NPCs are driven by a local LLM instead of canned dialog trees. Each NPC rolls a persisted identity, remembers conversations with individual players across reboots, runs its own errands and cross-map journeys, and — the part I’m writing about today — strikes up ambient chatter with nearby NPCs on its own. The newest work extends all of that from townsfolk to language-speaking monsters: ogres, lizardmen, ratmen, gargoyles, daemons, and especially liches, who address each other like god-kings deigning to notice an insect. Inference is a local gemma-class model behind an in-cluster gateway, so it’s free and private, with the one tradeoff being cold-load latency. It’s single-shard hobby-scale and it absolutely shows the seams. I love it. ...

C# integration scripts wiring a local language model into an Ultima Online shard

How LLM-driven NPCs work in Ultima Online (ServUO)

TL;DR I open-sourced the integration that puts a local LLM behind the NPCs on my Ultima Online (ServUO) shard. It’s about 7,500 lines of C# that drop into a shard’s Scripts/Custom/ directory and compile at boot — no separate build, no service to deploy. This post is the code-level companion to the story version of the project: how config hot-reloads, how the model client marshals async results back onto the game thread, how the LLM is kept entirely out of the simulation loop, and how a deterministic allowlist makes a non-deterministic model safe to put in a stateful world. The whole thing is fail-open: if the model is slow, down, or wrong, the NPC silently degrades to a vanilla ServUO NPC. Code is on GitHub: ZoltyMat/uo-llm-npc. ...
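
The allowlist and fail-open ideas can be sketched in a few lines. The real integration is C#; this is a Python illustration only, and the verb set, the `pick_verb` name, and the `ask_model` callable are hypothetical rather than taken from the repository. It assumes a slow or unreachable model surfaces as an exception from the caller-supplied function.

```python
from typing import Callable, Optional

# Hypothetical allowlist: the only actions the model is permitted to select.
ALLOWED_VERBS = frozenset({"say", "wave", "bow", "idle"})


def pick_verb(ask_model: Callable[[str], str], prompt: str) -> Optional[str]:
    """Return an allowlisted verb, or None to fall back to default NPC behavior.

    The model's output is treated as untrusted text: it is normalized and
    then checked against ALLOWED_VERBS, never executed or interpreted.
    """
    try:
        reply = ask_model(prompt)
    except Exception:
        return None  # fail open: model down or timed out, so act as a vanilla NPC
    if not isinstance(reply, str):
        return None
    verb = reply.strip().lower()
    return verb if verb in ALLOWED_VERBS else None
```

Returning `None` instead of raising is the design point: the caller needs only one branch for "no usable answer", whether the model failed or produced something off the list.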

Running AWS Lens as a Self-Hosted Web App on k3s

TL;DR AWS Lens is an open-source Electron desktop app for managing AWS resources — EC2, S3, Lambda, IAM, Cost Explorer, and more. I wanted it accessible from my browser without running a desktop app. I adapted it to run as a containerized Express server on k3s, fixed a class of runtime crashes from the Electron-to-web adapter, hardened it against three security issues, and deployed it behind Traefik and Let’s Encrypt. The changes are open-source in BoraKostem/AWS-Lens#21. ...