Fleet Dashboard
Container monitoring across the whole Rogue AI fleet
Built by Rogue AI · single-pane container monitoring for the self-hosted fleet · Internal tool
Built incrementally as the fleet grew; used daily as the operations console.
The problem
The fleet grew to roughly sixty containers spread across separate apps, each with its own network, ports, database, and Redis. Checking health meant running docker ps and tailing logs for each app while holding the whole port-and-subnet map in your head. There was no single place to answer 'what is up, what is struggling, and where does it live?'
What I built
A single-page dashboard that shows the whole fleet at a glance: per-app cards with a color-coded health badge, CPU and memory bars, database sizes, and an expandable log viewer. It groups raw containers back into the apps they belong to, surfaces the cross-app port and subnet map, and lets the operator restart, rebuild, or pull an image straight from the card. Database containers are deliberately excluded from those actions.
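The grouping and the database exclusion described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dashboard's actual code: it assumes containers carry the Docker Compose project label, that database-like containers can be recognized by image name, and that health is already reduced to green, yellow, or red. The function names and the container dictionary shape are hypothetical.

```python
from collections import defaultdict

# Assumption: apps are identified by the Compose project label.
PROJECT_LABEL = "com.docker.compose.project"
# Assumption: database-like containers are recognized by image name prefix.
DATABASE_IMAGES = ("postgres", "mysql", "mariadb", "mongo", "redis")
ACTIONS = {"restart", "rebuild", "pull"}
# Higher number means worse health.
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def is_database(container):
    """Return True if the image name (registry path stripped) looks like a database."""
    image = container["image"].split("/")[-1]
    return image.startswith(DATABASE_IMAGES)


def group_by_app(containers):
    """Group raw containers under the app they belong to."""
    apps = defaultdict(list)
    for c in containers:
        app = c.get("labels", {}).get(PROJECT_LABEL, "ungrouped")
        apps[app].append(c)
    return dict(apps)


def app_badge(containers):
    """An app's badge is the worst health among its containers."""
    return max((c["health"] for c in containers), key=SEVERITY.__getitem__)


def allowed_actions(container):
    """Database containers get no restart, rebuild, or pull actions."""
    return set() if is_database(container) else set(ACTIONS)
```

Taking the worst container's health as the app badge is one simple way to make a single color trustworthy: a card only shows green when every container behind it is green.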
Architecture
Tech stack
What broke first
- Reading the Docker socket is a privileged operation, so it does not belong inside the user-facing web app. Isolating it in a separate sidecar with one narrow job keeps the blast radius small if the frontend is ever compromised.
- A single number is worth more than a wall of graphs. Most days the only question is 'is everything green?', so the dashboard answers that first and keeps per-container detail one click away.
- Polling a few dozen containers on every page view is enough to overwhelm the socket. Caching stats on the sidecar for a short window and refreshing on an interval keeps the host responsive without lying about state.
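The short-window cache from the last point could look something like the sketch below. It is a minimal illustration, not the sidecar's real implementation: the class name, the 5-second default, and the fetch callable are all assumptions. Returning the age of the data alongside the stats is one way to avoid "lying about state", since the page can show how stale a reading is.

```python
import threading
import time


class StatsCache:
    """Serve cached stats for a short window; refetch once the window expires."""

    def __init__(self, fetch, ttl_seconds=5.0, clock=time.monotonic):
        self._fetch = fetch          # hypothetical callable that queries Docker
        self._ttl = ttl_seconds      # assumed window; tune to the fleet
        self._clock = clock          # injectable so the cache can be tested
        self._lock = threading.Lock()
        self._value = None
        self._fetched_at = None

    def get(self):
        # The lock ensures concurrent page views trigger at most one fetch.
        with self._lock:
            now = self._clock()
            expired = (
                self._fetched_at is None
                or now - self._fetched_at >= self._ttl
            )
            if expired:
                self._value = self._fetch()
                self._fetched_at = now
            # Report the age so the caller never mistakes cached data for live data.
            return {"stats": self._value, "age_seconds": now - self._fetched_at}
```

With this shape, any number of page views inside the window costs the socket a single query, and the interval refresh mentioned above can simply call get() on a timer.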
Outcome
Day-to-day fleet checks collapsed from a sweep of per-app terminal commands into one glance at one page. Privileged Docker access stays sealed in a single isolated sidecar rather than spread across the apps it watches, and routine restarts now happen from the dashboard with a record of who did what.
Honest limits
This is an internal operations tool, not a product. It runs in the local lab to watch a self-hosted fleet, and it was built solo for one operator. The real trade-off to call out: the Python sidecar holds privileged access to the Docker socket. Even mounted read-only and locked behind a shared secret on an isolated network with no external port, anything that can reach the socket is effectively root-adjacent on the host. That risk is accepted deliberately and contained by keeping it off the user-facing app, but it is not eliminated.
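For the shared-secret gate mentioned above, a check along these lines is one reasonable shape. It is a sketch under assumptions: the header name X-Sidecar-Token and the function name are hypothetical, and how the sidecar actually receives and stores its secret is not described here. The one firm point is that hmac.compare_digest from the Python standard library compares in constant time, which avoids leaking the secret through response timing.

```python
import hmac

# Hypothetical header name; the real sidecar may use a different one.
TOKEN_HEADER = "X-Sidecar-Token"


def is_authorized(headers, expected_secret):
    """Return True only if the request presents the shared secret."""
    if not expected_secret:
        # Fail closed: a missing or empty configured secret authorizes nobody.
        return False
    presented = headers.get(TOKEN_HEADER, "")
    # Encode to bytes so non-ASCII input is compared rather than raising.
    return hmac.compare_digest(
        presented.encode("utf-8"), expected_secret.encode("utf-8")
    )
```

A check like this narrows who can talk to the sidecar; it does not change what the socket can do once reached, which is why the risk is described as contained and not eliminated.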
