The Secure Feedback Loop: Matrix as a DevOps Control Plane

Jackal Seeks

The problem

Most DevOps setups keep monitoring data in one place (InfluxDB, Loki) and send alerts to another (email, Slack). When something breaks, you're switching between tabs to piece together what happened. The alert doesn't have the context, and the dashboard doesn't have the conversation. On top of that, Slack and email aren't encrypted end-to-end, which matters when alerts contain infrastructure metadata.

The idea: treat Matrix rooms as event journals

Matrix isn't just chat. It's a distributed, encrypted, federated state machine with a DAG for persistence. So instead of using it for conversation, use it as the place where every verification step -- build checks, health probes, deployment results -- gets recorded as a room event. The room becomes the journal.

These experiments test how well that capability holds up in practice.
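To make "the room is the journal" concrete, here is a minimal sketch of recording one verification step as a custom Matrix room event. The event type `com.example.verification` and the field names are placeholders, not a settled schema, and the matrix-nio usage shown in the comments assumes illustrative homeserver and room IDs.

```python
import time

EVENT_TYPE = "com.example.verification"  # hypothetical custom event type


def verification_event(check: str, status: str, detail: str = "") -> dict:
    """Build the content dict for one journal entry in the room."""
    return {
        "check": check,          # e.g. "health-probe:web-01"
        "status": status,        # "pass" | "fail"
        "detail": detail,
        "ts": int(time.time() * 1000),  # client-side timestamp, ms
    }


# Posting it with matrix-nio (IDs below are placeholders):
#
#   from nio import AsyncClient
#
#   client = AsyncClient("https://matrix.example.org", "@ci-bot:example.org")
#   await client.login("password")
#   await client.room_send(
#       room_id="!journal:example.org",
#       message_type=EVENT_TYPE,
#       content=verification_event("deploy:web", "pass"),
#   )
```

Because the payload is a plain custom event rather than an `m.room.message`, it lands in the room's DAG and gets E2EE like any other event, but stays machine-readable for later consumers.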

Architecture: split access through Caddy

For these experiments, Caddy sits in front as a reverse proxy with two paths in:

  • Internal: Local services (Telegraf, Python collectors) and the Fedora workstation talk over a private bridge. Trusted, no extra auth.
  • External: GitHub Actions and remote deployment targets (Debian boxes) push results through an OIDC-authenticated gateway.

Synapse handles identity (SSO), persistence (the DAG), and encryption (E2EE) between them.
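The split described above might look roughly like this in a Caddyfile. Everything here is a sketch: the hostname, the bridge CIDR, and the upstream are placeholders, and the OIDC step assumes the caddy-security plugin, which is not part of core Caddy.

```caddyfile
matrix.example.com {
    # Placeholder CIDR for the private bridge the internal services use.
    @internal remote_ip 10.88.0.0/16 127.0.0.1

    # Internal path: trusted bridge traffic, proxied straight through.
    handle @internal {
        reverse_proxy synapse:8008
    }

    # External path: GitHub Actions and remote targets must authenticate
    # first. "authorize" is a caddy-security directive (plugin, assumed).
    handle {
        # authorize with ci_push_policy
        reverse_proxy synapse:8008
    }
}
```

The design choice worth noting: the trust decision lives entirely in the proxy, so Synapse sees one upstream and never needs to distinguish the two networks itself.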

Adding AI to the loop

Matrix already supports a one-to-many observer pattern, which makes it straightforward to drop AI agents into a room alongside human operators. The flow:

  1. A solti-monitoring probe fails.
  2. A Matrix bot (the observer) picks up the failure event.
  3. An AI agent reads the event, checks it against the documentation in the digital garden, and posts a suggested fix or diagnostic as a reaction in the room.

This uses MCP (Model Context Protocol) to give the agent access to relevant context.
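The three steps above can be sketched as an observer bot that filters for failure events and posts the agent's diagnostic back as a threaded reply on the failure itself. The event type and field names follow the hypothetical schema used earlier; the MCP call is stubbed out, and the matrix-nio loop in the comments is an assumption about how the listener would be wired up.

```python
def diagnostic_reply(failure_event_id: str, diagnosis: str) -> dict:
    """Content for an m.room.message threaded off the failure event."""
    return {
        "msgtype": "m.text",
        "body": diagnosis,
        "m.relates_to": {
            "rel_type": "m.thread",
            "event_id": failure_event_id,
        },
    }


# With matrix-nio (assumed), the observer loop looks roughly like:
#
#   from nio import AsyncClient, UnknownEvent
#
#   async def on_event(room, event):
#       content = event.source.get("content", {})
#       if event.type == "com.example.verification" and \
#               content.get("status") == "fail":
#           diagnosis = await ask_agent(content)  # MCP call, stubbed
#           await client.room_send(room.room_id, "m.room.message",
#                                  diagnostic_reply(event.event_id, diagnosis))
#
#   client.add_event_callback(on_event, UnknownEvent)
```

Threading the reply onto the failure event keeps the diagnosis attached to the record it explains, which is the whole point of using the room as the journal.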

What this gets you

Verification stays inside encrypted rooms instead of scattered across email threads and Slack channels. GitHub and other public-internet tools stay on the outside of the gateway. You can iterate quickly without leaking infrastructure secrets, and the AI agent means fewer context switches when something breaks -- the diagnosis shows up where the failure was recorded.


Next steps

To turn this into an implementation plan, three things need pinning down:

  1. Event schema: What does a verification failure payload actually look like in JSON?
  2. Caddy config: The specific split-access rules for jackaltx.com.
  3. Bot spec: Which Python Matrix library to use, and how the one-to-many listeners work.
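As a starting point for item 1, one possible strawman payload, with every field name open to revision:

```json
{
  "check": "health-probe:web-01",
  "status": "fail",
  "detail": "HTTP 503 from /healthz after 3 retries",
  "source": "solti-monitoring",
  "run_id": "gh-actions-4217",
  "ts": 1700000000000
}
```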
