Private AI · a platform I build

Private AI
that never leaves the building.

A complete AI assistant over your own documents and systems, running entirely on hardware you control and built from a self-hosted fleet I run today. Data never leaves the building, every answer is sourced and logged, and security is designed in from the start, the way I secure networks, rather than bolted on afterward.

Isometric cutaway of a glass-walled, self-hosted secure server room with an AI core glowing at its centre — the data sealed inside the building.
Fully local inference · Source-cited answers · Immutable audit log · EU / self-hosted
Why this exists

The models are ready. The data isn't allowed to leave.

Regulated and data-sensitive organisations (finance, insurance, healthcare, legal, professional services) want what cloud AI does, but over their own contracts, policies, and records. For many of them, sending that data to a US-hosted API is a non-starter given GDPR, NIS2, DORA, and plain client confidentiality.

The usual answers are “wait” or “buy a six-figure appliance.” There is a third option: run it yourself, as software, on infrastructure you already own.

The platform

Four layers. All self-hosted.

Each layer is real, running code — extracted from a fleet of 20+ self-hosted AI applications I run today, not a slide.

Layer 01

Local model serving

Open-weight models served locally through Ollama, routed via a hardened provider gateway with rate-limiting and metrics. Switch models per task. Nothing calls out to a third party.
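
As a rough illustration of the gateway idea, the sketch below puts a token-bucket rate limiter in front of a request to a local Ollama server. The `TokenBucket` class, the `build_generate_request` helper, and the limits are hypothetical names chosen for this example, not the platform's actual gateway code. The address `http://localhost:11434/api/generate` is Ollama's default local endpoint.

```python
import json
import time
import urllib.request


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate = float(rate)
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def build_generate_request(model, prompt, host="http://localhost:11434"):
    """Build (but do not send) a non-streaming request to a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        host + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A gateway built along these lines would call `allow()` before forwarding each request and send it with `urllib.request.urlopen` only when the bucket permits. Because the host defaults to localhost, the request never leaves the machine.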

Layer 02

Zero-egress retrieval

RAG over your own documents with a source citation on every answer — auto-chunking, dedup, and reranking built in. The model can't cite a clause it doesn't have, and can't invent one that isn't there.
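
To make the chunking and dedup steps concrete, here is a minimal sketch under stated assumptions: fixed-size character windows with overlap, and exact-duplicate removal by hashing normalised text. The function names, the window sizes, and the index layout are illustrative only. Embedding and reranking are left out, and the platform's real pipeline may work differently.

```python
import hashlib


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows as (start_offset, chunk) pairs."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + size]))
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def build_index(documents, size=200, overlap=50):
    """Chunk every document and skip chunks whose normalised text was already seen."""
    seen = set()
    index = []
    for source, text in documents.items():
        for start, chunk in chunk_text(text, size, overlap):
            normalised = " ".join(chunk.split()).lower()
            digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
            if digest in seen:
                continue
            seen.add(digest)
            # Source and offset travel with the chunk so an answer can cite the passage.
            index.append({"source": source, "offset": start, "text": chunk})
    return index
```

Keeping the source name and offset on every chunk is what makes a citation possible later: whatever the retriever returns already carries a pointer back to the passage it came from.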

Layer 03

Assistant & agents

A chat assistant grounded in your data ships in the first install. Multi-step agents that take real actions across your systems (observable, resumable, idempotent on retry) already run across the fleet and will be packaged into the product next.
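
"Idempotent on retry" can be sketched with an idempotency key: each agent step is keyed by its run, its name, and its parameters, and a retry with the same key returns the stored result instead of repeating the action. `StepLedger` and `idempotency_key` are hypothetical names for this example, and the in-memory dictionary stands in for whatever durable store a real deployment would use.

```python
import hashlib
import json


def idempotency_key(run_id, step, params):
    """Derive a stable key from the run, the step name, and the step parameters."""
    payload = json.dumps({"run": run_id, "step": step, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class StepLedger:
    """Remember the result of each completed step, keyed by its idempotency key."""

    def __init__(self):
        self._results = {}

    def run_once(self, key, action):
        # A retry with the same key returns the stored result; the action is not repeated.
        if key in self._results:
            return self._results[key]
        result = action()
        self._results[key] = result
        return result
```

Sorting the JSON keys means the same parameters always produce the same key, whatever order they were supplied in.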

Layer 04

Governance, audit & access

Every question and answer is written to an immutable log. Role-based access, least-privilege database users per workload, secrets held in environment variables only, and Linux capabilities dropped at the container level.
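
One common way to make an audit log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below assumes that approach purely for illustration; the source does not say how the platform's log is implemented, and the function names and record fields are hypothetical. A hash chain detects edits but does not prevent them, so real immutability would also need append-only or write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, record):
    """Hash the previous entry's hash together with this record's canonical JSON."""
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, question, answer, sources):
    """Append one exchange to the log, chained to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    record = {"question": question, "answer": answer, "sources": list(sources)}
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})
    return log[-1]


def verify(log):
    """Return True only if every entry still matches its hash and its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing any stored question, answer, or source after the fact breaks the chain from that entry onward, which `verify` reports.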

The difference

Secured by a security engineer — not bolted on.

Seventeen years of hardening DAX-30 enterprise networks sit underneath this stack. Capability-dropped containers, per-app network isolation, prompt-injection defence, and secrets hygiene are the starting point, not a finding in next year's audit. The same discipline that secures a regulated firewall estate, applied to your private AI.

A design decision

Why software, not an appliance.

I built it as software you run yourself, rather than a hardware box — for a few deliberate reasons.

An appliance would mean
  • A hardware purchase, sized and paid for up front
  • A box to house, power, and maintain
  • A dependency on one appliance's hardware and roadmap
  • Capacity fixed at purchase — used or not
Software means
  • Runs on your hardware, your VPC, or a box you already own
  • Installs with one command — docker compose up
  • No appliance, no capex, no vendor lock-in
  • You own and can audit every layer of the stack
Where it is today

Start with one job, done completely.

The first deployment does one thing end-to-end, properly — before anything is added on top.

  1. Point it at your documents
    A folder, a share, an export — your contracts, SOPs, policies, records.
  2. Ask in plain language
    It answers only from those documents. No outside knowledge bleeds in.
  3. Every answer cites its sources
    No source, no claim. You can click straight to the passage it used.
  4. Every exchange is logged
    Question, answer, and sources written to an immutable audit trail.
  5. It all runs locally
    docker compose up on your own box. No external API anywhere in the answer path.
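
The "no source, no claim" rule in step 3 can be sketched as a gate between the model's draft and the user. It is a minimal illustration, not the product's actual logic: `answer_or_refuse` and its return shape are hypothetical, and it assumes citations are identifiers that can be compared against the set of passages the retriever actually returned.

```python
def answer_or_refuse(draft_answer, cited_sources, retrieved_sources):
    """Release an answer only if it cites at least one passage that was actually retrieved."""
    # Drop citations that do not match any retrieved passage.
    valid = [s for s in cited_sources if s in retrieved_sources]
    if not valid:
        # No verifiable source: withhold the claim rather than pass it on uncited.
        return {"answer": None, "sources": [], "refused": True}
    return {"answer": draft_answer, "sources": valid, "refused": False}
```

For example, a draft citing `"a.pdf#p3"` when that passage was retrieved is released with that citation attached. A draft with no matching citation comes back with `refused` set to `True`.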

Honest status: the core is real and runs today, extracted from a self-hosted fleet I operate myself. The product packaging is early — connectors (SharePoint, core systems), agents, per-task models, and multi-tenant deployment come next. It's ready for a pilot, not a press release.

Where it matters

The rooms where data can't leave.

It matters most where confidential data can't cross the perimeter — regulated, data-sensitive work across the EU and DACH.

  • Regulated finance (MiFID II · DORA)
  • Insurance
  • Healthcare
  • Legal & professional services
  • Data-sensitive SMEs
  • Public sector

One of the systems I build.

If you're solving the same problem — useful AI over data that can't leave the building — get in touch.

Get in touch
Questions

Straight answers.

Does any of our data leave our infrastructure?
No. Models, retrieval, and storage all run locally on hardware you control. There is no external API in the answer path unless you explicitly choose to add one.
Do we need to buy a special appliance?
No. It is software. It runs on your existing servers, a private cloud (VPC), or a single GPU box you already own.
How is this different from a cloud AI assistant?
A cloud assistant sends your data to a third-party provider. This runs entirely inside your perimeter. The trade-off is that you host it — and for regulated data, that's the entire point.
Is it finished?
The core is real and runs today; the product packaging is early. The honest answer is that it's ready for a pilot, not a press release — which is exactly when a design partner gets the most influence over it.
What models does it use?
Open-weight models (Llama, Qwen, and similar) served via Ollama. You can swap them at any time — you are never locked to a single vendor or model.