AI Agent Orchestration: Multi-Agent Systems for Production
Building a single AI agent that demos well takes an afternoon. Building multi-agent systems that run reliably in production — handling tool failures, maintaining state across sessions, executing tasks in parallel, integrating with real APIs and databases — is a substantially harder engineering problem. This guide covers what actually changes when you move from prototype to production.
This is based on deploying a 100+ skill agent fleet (including 60+ custom Claude Code skills, MCP integrations, and parallel execution workflows) running in production. The lessons are hard-won.
What Makes Multi-Agent Systems Hard
Tool failures cascade
When Agent A depends on the output of Agent B, and Agent B's tool call fails, the entire workflow breaks without explicit error handling. Production agents need retry logic, fallback paths, and graceful degradation at every tool call.
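A minimal sketch of that pattern, assuming a generic async `call_tool(name, args)` interface and the status/reason result shape used later in this guide (both are placeholders, not a real SDK):

```python
import asyncio

async def call_with_retry(call_tool, name, args, retries=3, base_delay=1.0, fallback=None):
    """Retry a failing tool call with exponential backoff, then degrade to a fallback tool."""
    for attempt in range(retries):
        result = await call_tool(name, args)
        if result.get("status") == "ok":
            return result
        await asyncio.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s backoff
    if fallback:
        return await call_tool(fallback, args)  # graceful degradation path
    return {"status": "error", "reason": f"{name} failed after {retries} attempts"}
```

Wrapping every tool call in something like this keeps one flaky dependency from taking down the whole workflow.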
Context windows fill up fast
Long-running agents accumulate context. Tool call results, intermediate reasoning, and previous steps consume the context window before the task completes. Summarization and context management are non-optional.
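One common shape for this is a token-budget trimmer: keep the most recent messages that fit, and replace the overflow with a summary. A hedged sketch, where the 4-chars-per-token estimate is a rough stand-in for a real tokenizer and `summarize` would typically be an LLM call:

```python
def count_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. Use the model's tokenizer in production.
    return max(1, len(text) // 4)

def trim_context(messages, budget, summarize):
    """Keep the newest messages under `budget` tokens; summarize the overflow."""
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = count_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    overflow = messages[: len(messages) - len(kept)]
    if overflow:
        # Collapse older history into a single summary message.
        kept.append({"role": "system", "content": summarize(overflow)})
    return list(reversed(kept))
```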
State persistence across sessions is hard
Agents that restart a task from scratch every session are useless for long-horizon work. Production agents need persistent memory that survives process restarts and session boundaries.
The Production Agent Architecture
Tool Design
Tool quality is the primary determinant of agent quality. Bad tools produce bad agents regardless of the model. A well-designed tool:
Does one thing
A tool named search_and_summarize_and_email will be used incorrectly. Split it into three tools.
Returns structured output
Return typed JSON with status, data, and error fields. Never return raw strings the agent must parse.
Has explicit error states
Return { status: 'error', reason: '...' } rather than throwing exceptions. Agents handle structured errors much better than stack traces.
Is idempotent where possible
Tools that can be called twice safely allow retry logic without side effects. Especially important for write operations.
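The structured-output and explicit-error principles above can be sketched together. `fetch_pricing` is a hypothetical tool; the `status`/`data`/`reason` shape follows the conventions described in this section:

```python
import json

def fetch_pricing(product_id: str) -> str:
    """A tool that always returns typed JSON -- never raw strings or exceptions."""
    try:
        if not product_id:
            return json.dumps({"status": "error", "reason": "product_id is required"})
        # ... real price lookup would go here ...
        price = {"product_id": product_id, "amount": 49.0, "currency": "USD"}
        return json.dumps({"status": "ok", "data": price})
    except Exception as exc:  # never let a stack trace reach the agent
        return json.dumps({"status": "error", "reason": str(exc)})
```

Because the error is data rather than an exception, the agent can read the `reason` field and decide to retry, ask the user, or fall back.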
Parallel Execution
Most agent workflows have independent subtasks that can run in parallel. Sequential execution is the default but wastes significant time.
# Sequential (slow): 3 API calls × ~2s each = ~6s total
result_a = await agent.call_tool("search_competitors")
result_b = await agent.call_tool("fetch_pricing")
result_c = await agent.call_tool("analyze_reviews")

# Parallel (fast): 3 concurrent calls finish in max(~2s) = ~2s total
results = await asyncio.gather(
    agent.call_tool("search_competitors"),
    agent.call_tool("fetch_pricing"),
    agent.call_tool("analyze_reviews"),
)
The Claude API supports parallel tool calls natively — the model returns multiple tool_use blocks in a single response. Parse all of them and execute in parallel before sending the next message.
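A sketch of that parse-and-fan-out step. The block shape (`type`, `id`, `name`, `input`) mirrors the Anthropic Messages API tool_use blocks; `handlers`, which maps tool names to async functions, is an assumption of this sketch:

```python
import asyncio

async def run_tool_uses(content_blocks, handlers):
    """Execute every tool_use block from one model response concurrently."""
    tool_uses = [b for b in content_blocks if b["type"] == "tool_use"]
    results = await asyncio.gather(
        *(handlers[b["name"]](**b["input"]) for b in tool_uses)
    )
    # One tool_result per tool_use, ready to send back in the next user message.
    return [
        {"type": "tool_result", "tool_use_id": b["id"], "content": result}
        for b, result in zip(tool_uses, results)
    ]
```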
Persistent Memory
There are two types of agent memory worth implementing:
Session memory (Redis)
Conversation history and working state for the current task. Lives in Redis with a TTL. Allows an interrupted task to resume where it left off.
Long-term memory (PostgreSQL)
Facts, preferences, decisions, and outcomes the agent should remember across sessions. Stored as structured records with semantic search via pgvector for retrieval.
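The session-memory half can be sketched in a few lines. This assumes only that `store` exposes `get` and `setex` (redis-py's `Redis` client fits that shape); the key naming and JSON serialization are choices of this sketch, not a prescribed schema:

```python
import json

class SessionMemory:
    """Working state for the current task, keyed by session and expiring via TTL."""

    def __init__(self, store, ttl_seconds=3600):
        self.store = store
        self.ttl = ttl_seconds

    def save(self, session_id, state):
        # setex writes the value with a TTL, so abandoned sessions expire on their own.
        self.store.setex(f"session:{session_id}", self.ttl, json.dumps(state))

    def load(self, session_id):
        raw = self.store.get(f"session:{session_id}")
        return json.loads(raw) if raw else None
```

On restart, the agent calls `load(session_id)` first and resumes from the saved step instead of starting over.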
MCP Integration
The Model Context Protocol (MCP) standardizes how agents connect to external systems. Instead of writing custom tool wrappers for every service, MCP servers expose a standard interface that any MCP-compatible agent can use. This is the right abstraction layer for production agent integrations.
What MCP enables
- Connect agents to GitHub, databases, browsers, file systems, and custom APIs via a unified protocol
- Swap underlying implementations without changing agent code
- Compose complex workflows from MCP server combinations
- Debug tool calls with standardized logging and tracing
Skills Architecture (100+ Skills at Scale)
For agents with many capabilities, skills-based architecture separates the agent controller from the capability implementation:
Skill as a markdown spec
Each skill is a markdown file describing what it does, when to use it, and step-by-step instructions. The agent loads the relevant skill at runtime.
Skill discovery
Index all skills with embeddings. When a task arrives, retrieve the top-3 relevant skills and inject them into the agent's context before execution.
Skill versioning
Treat skills like code — version control them, review changes, and roll back when a skill produces bad outputs.
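The skill-discovery step above reduces to ranking skill embeddings by cosine similarity against the task embedding. A minimal sketch, where the embeddings are assumed to come from whatever embedding model you already use:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_skills(task_vec, skill_index, k=3):
    """skill_index: list of (skill_name, embedding) pairs; returns the k best names."""
    ranked = sorted(skill_index, key=lambda s: cosine(task_vec, s[1]), reverse=True)
    return [name for name, _ in ranked[:k]]
```

At 100+ skills this stays fast enough to run on every task; the returned skill files are then injected into the agent's context before execution.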
Building an Agent System?
Rogue AI designs and builds multi-agent systems with 100+ skills, parallel execution, persistent memory, and MCP integration. From architecture to production deployment in Docker.