Private AI · a platform I build

Private AI
that never leaves the building.

An AI assistant over your own documents and systems, running on hardware you control. Data stays inside your perimeter, every answer cites its source and lands in an audit log, and the stack is hardened with the same discipline I use for regulated networks.

See how it's built · Discuss a pilot

[Image: isometric cutaway of a glass-walled, self-hosted secure server room with an AI core glowing at its centre, the data sealed inside the building.]

Fully local inference · Source-cited answers · Immutable audit log · EU / self-hosted

Why this exists

The models are ready. The data isn't allowed to leave.

Regulated and data-sensitive organisations (finance, insurance, healthcare, legal, professional services) want what cloud AI does, but over their own contracts, policies, and records. Sending that data to a US-hosted API is a non-starter under GDPR, NIS2, DORA, and plain client confidentiality.

The usual answers are “wait” or “buy a six-figure appliance.” There is a third option: run it yourself, as software, on infrastructure you already own.

The platform

Four layers. All self-hosted.

Each layer is working code, extracted from a fleet of 20+ self-hosted AI applications.

Layer 01

Local model serving

Open-weight models served locally through Ollama, routed via a hardened provider gateway with rate-limiting and metrics. Switch models per task. Nothing calls out to a third party.
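To make the rate-limiting and metrics idea concrete, here is a minimal sketch of a token-bucket check that a gateway could run before forwarding a request to the local model server. It is an illustration under assumptions, not the platform's actual gateway code: the class name `RateLimitedGateway`, its parameters, and the metric names are hypothetical, and the forwarding step itself is left out.

```python
import time
from typing import Callable, Dict


class RateLimitedGateway:
    """Hypothetical token-bucket limiter placed in front of a local model server."""

    def __init__(self, capacity: int, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = float(capacity)             # maximum burst size
        self.refill_per_second = refill_per_second  # sustained request rate
        self.clock = clock                          # injectable for testing
        self.tokens = float(capacity)               # start with a full bucket
        self.last_refill = clock()
        # Simple counters standing in for real metrics export.
        self.metrics: Dict[str, int] = {"allowed": 0, "rejected": 0}

    def allow(self) -> bool:
        """Return True if one request may pass, consuming a token."""
        now = self.clock()
        elapsed = now - self.last_refill
        self.last_refill = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            self.metrics["allowed"] += 1
            return True
        self.metrics["rejected"] += 1
        return False
```

A caller would check `allow()` before forwarding a prompt and return an error (for example HTTP 429) when it is False. A real gateway would likely keep one bucket per user or per workload rather than a single shared one.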

Layer 02

Zero-egress retrieval

Retrieval-augmented generation (RAG) over your own documents, with automatic chunking, deduplication, and reranking built in. Every answer carries a source citation. No source, no claim.
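The pieces named above can be sketched in a few lines. This is a simplified illustration, assuming fixed-size character chunks with overlap, exact-match deduplication on normalised text, and a guard that refuses to answer without a supporting passage. All names (`Chunk`, `chunk_document`, `dedupe`, `answer_with_citations`) are hypothetical, and embedding, retrieval, and reranking are omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    source: str  # document the chunk came from, e.g. a file path
    start: int   # character offset of the chunk within that document
    text: str


def chunk_document(source: str, text: str,
                   size: int = 200, overlap: int = 50) -> List[Chunk]:
    """Split text into fixed-size windows that overlap by `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [Chunk(source, i, text[i:i + size])
            for i in range(0, len(text), step)]


def dedupe(chunks: List[Chunk]) -> List[Chunk]:
    """Keep the first chunk for each distinct text, ignoring case and spacing."""
    seen = set()
    unique = []
    for chunk in chunks:
        normalised = " ".join(chunk.text.lower().split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(chunk)
    return unique


def answer_with_citations(draft: str, supporting: List[Chunk]) -> str:
    """No source, no claim: refuse when no passage supports the draft answer."""
    if not supporting:
        return "No supporting passage found in the indexed documents."
    citations = "; ".join(f"{c.source}@{c.start}" for c in supporting)
    return f"{draft} [sources: {citations}]"
```

Storing the source and offset on every chunk is what makes it possible to link an answer back to the exact passage it used.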

Layer 03

Assistant & agents

A chat assistant grounded in your data ships in the first install. Multi-step agents that take real actions across your systems (observable, resumable, idempotent on retry) already run across the fleet and will be packaged into the product next.

Layer 04

Governance, audit & access

Every question and answer is written to an immutable log. Access is role-based, each workload gets its own least-privilege database user, secrets live in the environment only, and capabilities are dropped at the container level.
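One common way to make an audit log tamper-evident is to chain entries by hash, so that changing any past record breaks every hash after it. The sketch below shows that idea only. It is not a description of how this platform stores its log, and the names (`AuditLog`, `GENESIS`, the record fields) are hypothetical. Timestamps and user identity are left out for brevity.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: Dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditLog:
    """Append-only, hash-chained log held in memory for illustration."""

    def __init__(self) -> None:
        self.entries: List[Dict] = []

    def append(self, question: str, answer: str, sources: List[str]) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        record = {"question": question, "answer": answer, "sources": sources}
        entry = {"record": record, "prev": prev_hash,
                 "hash": _digest(prev_hash, record)}
        self.entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry fails the check."""
        prev_hash = GENESIS
        for entry in self.entries:
            if (entry["prev"] != prev_hash
                    or entry["hash"] != _digest(prev_hash, entry["record"])):
                return False
            prev_hash = entry["hash"]
        return True
```

A hash chain detects tampering but does not prevent it. In practice it would be paired with append-only storage and database permissions that deny updates and deletes.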

The difference

Secured by a security engineer, not bolted on later.

Enterprise-network hardening discipline sits underneath this stack. Capability-dropped containers, per-app network isolation, prompt-injection defence, and secrets hygiene are the starting point. The same discipline that secures a regulated firewall estate, applied to private AI.

A design decision

Why software, not an appliance.

I built it as software you run yourself, rather than a hardware box, for a few deliberate reasons.

An appliance would mean

A hardware purchase, sized and paid for up front
A box to house, power, and maintain
Lock-in to one appliance's hardware and roadmap
Capacity fixed at purchase, used or not

Software means

Runs on your hardware, your VPC, or a box you already own
Installs with one command: docker compose up
No appliance, no capex, no vendor lock-in
You own and can audit every layer of the stack

Where it is today

Start with one job, done completely.

The first deployment does one thing end-to-end, properly, before anything is added on top.

Point it at your documents
A folder, a share, or an export: your contracts, SOPs, policies, and records.
Ask in plain language
It answers only from those documents. No outside knowledge bleeds in.
Every answer cites its sources
No source, no claim. You can click straight to the passage it used.
Every exchange is logged
Question, answer, and sources written to an immutable audit trail.
It all runs locally
Run docker compose up on your own box. No external API anywhere in the answer path.

Honest status: the core is real and runs today, extracted from a self-hosted fleet I operate myself. The product packaging is early: connectors (SharePoint, core systems), agents, per-task models, and multi-tenant deployment come next. It's ready for a pilot, not a press release.

Where it matters

The rooms where data can't leave.

It matters most where confidential data can't cross the perimeter: regulated, data-sensitive work across the EU and DACH.

Regulated finance (MiFID II · DORA)
Insurance
Healthcare
Legal & professional services
Data-sensitive SMEs
Public sector

One of the systems I build.

If you need useful AI over data that cannot leave your perimeter, we can scope a pilot around one document set and one measurable workflow.

Discuss a pilot

Questions

Straight answers.

Does any of our data leave our infrastructure?

No. Models, retrieval, and storage all run locally on hardware you control. There is no external API in the answer path unless you explicitly choose to add one.

Do we need to buy a special appliance?

No. It is software. It runs on your existing servers, a private cloud (VPC), or a single GPU box you already own.

How is this different from a cloud AI assistant?

A cloud assistant sends your data to a third-party provider. This runs entirely inside your perimeter. The trade-off is that you host it, and for regulated data, that's the entire point.

Is it finished?

The core is real and runs today; the product packaging is early. The honest answer is that it's ready for a pilot, not a press release, which is exactly when a design partner gets the most influence over it.

What models does it use?

Open-weight models (Llama, Qwen, and similar) served via Ollama. You can swap them at any time; you are never locked to a single vendor or model.
