Security Guide

Your AI Agent Gets Hijacked by Stolen Tokens, Not Prompts

Rogue AI · 2026-06-27 · 9 min read

[Image: a glowing access token being lifted away from an AI agent’s hand, representing session token theft]

An AI agent does not have a password for an attacker to phish. It has something worse: a long-lived token it presents on every call to your email, your repos, your cloud, your databases. Steal that token and you do not need the password, the second factor, or the agent itself. You just replay the session, and every system the agent could reach greets you as the agent. The biggest credential dump of 2026, roughly sixteen billion records pieced together from infostealer logs, was never really about passwords. It was about sessions. And the least watched, most over-privileged session in your stack is the one your agent holds.

This is the attack class that does not show up in your prompt-injection threat model or your model-evaluation suite, because the model is not involved. Nobody jailbreaks the agent. Nobody crafts a malicious instruction. An infostealer on a developer laptop, or a poisoned dependency on a build box, lifts a token from disk or memory, and the agent’s identity walks out the door intact. Here is why agents are the perfect victim, what already happened in 2026, and the controls that actually work once you accept that the credential is the asset.

The password stopped being the target

Attackers moved up the stack. The unit of theft is no longer the password; it is the authenticated session: the cookie or bearer token that proves you already passed the login. That artifact outlives the multi-factor prompt, so stealing it skips authentication entirely. The sixteen-billion-record compilation that made headlines in June 2026 was a pile of infostealer output, and infostealer logs capture session cookies and access tokens alongside the passwords. Those session artifacts are the part that actually matters.

This is why multi-factor authentication keeps getting bypassed without anyone touching the second factor. An adversary-in-the-middle proxy, or malware reading a token off the host, captures the live session and replays it. The login already happened. The token is valid. Detection systems that watch for failed logins or new-password events see nothing, because nothing failed and nothing changed. For a human, this is bad. For an agent, it is the default operating condition.

Your AI agent is the perfect victim

An AI agent concentrates everything an attacker wants in one identity. It authenticates once and then holds its credentials for as long as the process runs. It carries broad scopes, because nobody wants to interrupt an autonomous workflow to re-consent. It runs unattended, often on a server, with no human watching the screen when the session is used at three in the morning. And it talks to many systems at once, so one stolen token is not one account, it is the blast radius of every tool the agent was wired to.

Compare that to a human session. A person logs in, works, closes the laptop, and the session goes idle in a way that looks suspicious if it suddenly reactivates from another country. An agent’s session is supposed to fire constantly, from a datacenter IP, against dozens of endpoints. The behavioral baseline that helps catch a hijacked human account is exactly the behavior a healthy agent produces, which means a hijacked agent hides inside its own normal traffic. As argued in “Every AI agent is a non-human identity,” the agent is an account with credentials, and you have to secure the account, not the model.

This already happened in 2026

The agent-token attack is not theoretical, it shipped this year against real tooling. In 2026 Mitiga researchers showed that Claude Code’s OAuth tokens could be stolen through a stealthy redirect of its Model Context Protocol traffic: a classic man-in-the-middle that intercepts the token and keeps persistent access to whatever the assistant was connected to. The model was never tricked. The transport was.

The LiteLLM backdoor. In March 2026 two malicious releases of LiteLLM, a proxy with roughly ninety-five million monthly downloads, exfiltrated AWS tokens, GCP credentials, SSH keys, and Kubernetes configs from any machine that installed them. The poisoned versions sat on PyPI for about forty minutes and were pulled down some forty-seven thousand times in that window. Every install handed its agent infrastructure’s keys to a stranger.
Secrets sprawl in MCP configs. In the protocol’s first year of wide adoption, researchers found more than twenty-four thousand unique secrets sitting in Model Context Protocol configuration files. Those are long-lived tokens written to disk in plaintext, waiting for the first infostealer that reads the directory.
Full-lifecycle agent abuse. Between December 2025 and January 2026 a single operator used an AI assistant and its tools across an entire intrusion to breach six government agencies, which the World Economic Forum called the first confirmed AI-orchestrated espionage campaign. The agent was the operator’s hands, and its access was the prize.
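
To make the secrets-sprawl item concrete, here is a minimal Python sketch of the kind of check that would flag a plaintext token in a config file. It is an illustration under assumptions, not a real scanner: the regular expressions are rough heuristics, the sample uses the "mcpServers" layout that many MCP clients read, and every name and value in it is made up.

```python
import json
import re
from typing import Any, Iterator

# Hypothetical heuristics: key names that usually hold credentials, and a
# rough shape test for long opaque strings. Real scanners use richer rules.
SECRET_KEY_HINT = re.compile(r"(token|secret|password|api[_-]?key|authorization)", re.I)
OPAQUE_VALUE = re.compile(r"^[A-Za-z0-9_\-.=+/]{20,}$")


def find_plaintext_secrets(node: Any, path: str = "$") -> Iterator[str]:
    """Yield JSON paths whose key name and string value both look like a credential."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if isinstance(value, str):
                if SECRET_KEY_HINT.search(key) and OPAQUE_VALUE.match(value):
                    yield child
            else:
                yield from find_plaintext_secrets(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_plaintext_secrets(item, f"{path}[{index}]")


# Hypothetical config contents; the server name and token value are fake.
SAMPLE_CONFIG = json.loads("""
{
  "mcpServers": {
    "repo-tools": {
      "command": "npx",
      "args": ["-y", "example-mcp-server"],
      "env": {
        "EXAMPLE_API_TOKEN": "tok_live_0123456789abcdefghijklmnop",
        "LOG_LEVEL": "info"
      }
    }
  }
}
""")

if __name__ == "__main__":
    for hit in find_plaintext_secrets(SAMPLE_CONFIG):
        print("plaintext secret at", hit)
```

An infostealer does not need anything this careful. It copies the whole directory and sorts it out later, which is why the fix is to keep the value out of the file rather than to scan for it afterwards.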

These are not model failures, they are credential failures wearing an AI costume. OWASP put a name on the category in its Agentic Security Initiative as ASI03, identity and privilege abuse, precisely because the agent’s standing access is the thing worth stealing.

MFA was never the agent’s control

Multi-factor authentication protects a login event performed by a human who can tap a phone. An agent has no phone and performs no interactive login: it reads a token from a secret store or an environment variable and presents it. The factor was satisfied once, by whoever provisioned the credential, and never again. So when teams respond to credential theft by “adding MFA,” they harden the one door the agent does not use and leave the bearer token, the door it actually uses, sitting on disk.
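
A short Python sketch shows why the bearer token is the whole identity. The environment variable name and the endpoint are hypothetical, and nothing is sent over the network; the code only builds the request an agent would send and the one a thief would send with a copied token.

```python
import os
import urllib.request


def build_agent_request(url: str, token: str) -> urllib.request.Request:
    """Build the request an agent would send; nothing is transmitted here."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Hypothetical variable name and endpoint, used only for illustration.
token = os.environ.get("AGENT_API_TOKEN", "example-token-value")
url = "https://api.example.com/v1/mail"

from_agent = build_agent_request(url, token)
from_thief = build_agent_request(url, token)  # same string, copied to another machine

# The server receives identical credentials either way: possession is the only proof.
same = from_agent.get_header("Authorization") == from_thief.get_header("Authorization")
print(same)
```

No second factor appears anywhere in that exchange, which is the point: there is no step at which one could be asked for.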

The honest framing is uncomfortable: for a non-human identity, multi-factor authentication is close to irrelevant. What matters is how short-lived the token is, how narrowly it is scoped, where it is stored, and whether anything notices when it is used from the wrong place. Those are the levers, and almost nobody pulls them, because the agent “just works” with a static key and shipping beats hardening. This is the same lesson as leashing a coding agent: the danger is the standing access, not the intent.
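
Two of those levers, lifetime and scope, can be sketched with nothing but the Python standard library. This is a toy under stated assumptions, not a production token format: the signing key, the claim names, and the `mint_token` and `authorize` helpers are all hypothetical, and a real deployment would more likely rely on an existing identity provider or secrets broker.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Sequence

# Hypothetical broker signing key; in practice it would live in a vault or KMS.
BROKER_KEY = b"demo-signing-key-not-for-production"


def _sign(body: str) -> str:
    digest = hmac.new(BROKER_KEY, body.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def mint_token(agent_id: str, scopes: Sequence[str], ttl_seconds: int,
               now: Optional[float] = None) -> str:
    """Issue a signed token naming one agent, its scopes, and an expiry time."""
    issued_at = time.time() if now is None else now
    claims = {"sub": agent_id, "scopes": sorted(scopes), "exp": issued_at + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode("utf-8")).decode("ascii")
    return f"{body}.{_sign(body)}"


def authorize(token: str, required_scope: str, now: Optional[float] = None) -> bool:
    """Accept only an untampered, unexpired token that carries the required scope."""
    current = time.time() if now is None else now
    body, separator, signature = token.partition(".")
    if not separator:
        return False
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(body).encode("utf-8")):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return current < claims["exp"] and required_scope in claims["scopes"]


t0 = 1_000_000.0  # fixed clock so the example is repeatable
token = mint_token("mail-triage-agent", ["mail:read"], ttl_seconds=300, now=t0)
print(authorize(token, "mail:read", now=t0 + 60))    # inside the lifetime, in scope
print(authorize(token, "repo:write", now=t0 + 60))   # scope was never granted
print(authorize(token, "mail:read", now=t0 + 3600))  # replayed after expiry
```

Only the first check passes. A token stolen from this agent is good for one mailbox and for at most five minutes, which changes what an infostealer log is worth.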

Why device-bound cookies do not save the agent

The browser world has a real answer to session theft: Device Bound Session Credentials, which Google moved to general availability in Chrome 146 on Windows in 2026. DBSC ties a session to a non-exportable private key held in the machine’s Trusted Platform Module, and the browser signs a fresh challenge every few minutes to prove the original device is still present. A stolen cookie becomes useless on another machine because it cannot produce that signature.
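
The challenge-and-signature loop can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Chrome's implementation: real DBSC uses an asymmetric key pair whose private half stays in the TPM, while this sketch substitutes an HMAC over a shared key so that it runs on the standard library alone. The `Device` and `Server` classes and their methods are hypothetical.

```python
import hashlib
import hmac
import secrets


class Device:
    """Holds a per-device key, standing in for a TPM-protected private key."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)

    def enrollment_key(self) -> bytes:
        # Simplification: real DBSC registers a public key and never exports
        # the private one. HMAC needs the same key on both sides.
        return self._key

    def sign(self, challenge: bytes) -> bytes:
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class Server:
    """Remembers the key enrolled at login and checks proofs against it."""

    def __init__(self, enrolled_key: bytes) -> None:
        self._enrolled_key = enrolled_key

    def new_challenge(self) -> bytes:
        return secrets.token_bytes(16)

    def verify(self, challenge: bytes, proof: bytes) -> bool:
        expected = hmac.new(self._enrolled_key, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(proof, expected)


laptop = Device()
server = Server(laptop.enrollment_key())
attacker_box = Device()  # holds the stolen cookie, but not the laptop's key

challenge = server.new_challenge()
original_ok = server.verify(challenge, laptop.sign(challenge))
replay_ok = server.verify(challenge, attacker_box.sign(challenge))
print(original_ok, replay_ok)
```

The cookie never appears in the check at all. What the server tests is whether the caller can still answer a fresh challenge with the enrolled key, and a copied cookie cannot.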

That is excellent for human browsing and almost no help for a headless agent. DBSC assumes an interactive browser, a TPM, and a user session. Agents run server-side, in containers, across ephemeral hosts that may have no consistent secure element, and they authenticate machine-to-machine where the “device” is a fleet, not a laptop. The defense that is rescuing human sessions does not transfer cleanly to the place the risk is concentrating. Worse, as one researcher put it bluntly after DBSC shipped, device binding just pushes attackers toward running a hidden remote session on the victim host instead of stealing the cookie, so the live session itself becomes the target. For agents, the equivalent is simple: own the box, use the token in place.

How to actually protect an agent’s session

Treat the agent’s credential as the crown jewel, because to an attacker it is. The goal is to make a stolen token expire fast, reach little, and trip an alarm when it is used wrong. None of this requires a better model. It requires treating the agent as a privileged service account, which is what it is.

Short-lived tokens, no static keys. Issue credentials that live for minutes, not months, from a broker the agent calls at runtime. A token that has already expired by the time an infostealer ships it home is worth nothing.
Scope to the single job. One identity per agent, one set of permissions per task, no shared god-mode key across a fleet. A stolen token should unlock one mailbox or one repo, not the estate.
Default-deny egress. An agent that can only reach the three endpoints it needs cannot quietly ship a stolen token, or stolen data, anywhere else. Egress control turns a full compromise into a contained one.
Watch the session, not the login. Baseline where, when, and how the agent’s token is used, and alert on the deviation: a new ASN, an odd hour, a tool it never calls. The hijack shows up in usage, never in a failed login.
Keep secrets out of files. No tokens in MCP config, environment dumps, or repo history. Pull them from a vault at use time so there is nothing on disk for malware to read.
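
Of the controls above, default-deny egress is the easiest to show in code. The Python sketch below is an in-process illustration only; in practice this is usually enforced by a firewall or an outbound proxy rather than by the agent itself. The hostnames and the `egress_allowed` helper are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts this agent's job requires.
ALLOWED_HOSTS = frozenset({
    "api.example-mail.com",
    "vault.internal.example",
    "api.example-llm.com",
})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Default deny: permit only HTTPS requests to an exactly matching host."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        return False  # an unparseable URL is denied, not guessed at
    return parts.scheme == "https" and host in allowed_hosts


print(egress_allowed("https://api.example-mail.com/v1/messages"))
print(egress_allowed("https://paste.example.net/upload"))
print(egress_allowed("https://api.example-mail.com.evil.example/x"))
```

Only the first destination is permitted. The match is on the exact hostname, so a lookalike domain that merely starts with an allowed name is refused along with everything else that is not on the list.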

The shift in mindset is the whole game. Stop asking whether the model can be tricked and start asking what happens when the agent’s token is in someone else’s hands, because in the year of sixteen billion stolen sessions, that is the likely case, not the edge case. For the wider pattern, see securing self-hosted AI infrastructure and why prompt injection cannot be patched, since both come down to controls that live outside the model.

The quick test for any agent you run: if its token leaked right now, how long would it stay valid, how much could it reach, and would anything notice? If the answers are months, everything, and no, you do not have an AI security problem. You have a credential you forgot to manage.
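
The three questions translate directly into a small inventory check. The Python sketch below is one hypothetical way to encode them; the `AgentCredential` fields and both thresholds are assumptions to be replaced with your own inventory data and risk tolerance.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AgentCredential:
    name: str
    ttl_seconds: int        # how long a leaked copy stays valid
    reachable_systems: int  # how many tools or systems the token unlocks
    usage_monitored: bool   # whether anything alerts on abnormal use


# Hypothetical thresholds; choose values that match your own risk tolerance.
MAX_TTL_SECONDS = 15 * 60
MAX_REACHABLE_SYSTEMS = 1


def quick_test(cred: AgentCredential) -> List[str]:
    """Return one finding per question the credential fails."""
    findings = []
    if cred.ttl_seconds > MAX_TTL_SECONDS:
        findings.append(f"{cred.name}: a leaked token stays valid for {cred.ttl_seconds} seconds")
    if cred.reachable_systems > MAX_REACHABLE_SYSTEMS:
        findings.append(f"{cred.name}: one token reaches {cred.reachable_systems} systems")
    if not cred.usage_monitored:
        findings.append(f"{cred.name}: nothing would notice abnormal use")
    return findings


static_key = AgentCredential("fleet-static-key", ttl_seconds=90 * 24 * 3600,
                             reachable_systems=12, usage_monitored=False)
brokered = AgentCredential("mail-triage-agent", ttl_seconds=300,
                           reachable_systems=1, usage_monitored=True)

for cred in (static_key, brokered):
    for finding in quick_test(cred):
        print(finding)
```

The static fleet key fails all three questions and the brokered token fails none, which is the gap the quick test is meant to expose.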

Quick Reference

Human session theft vs AI agent token theft

Dimension	Human session	AI agent session
The credential	A cookie issued after an interactive login	A long-lived bearer token read from a secret store
MFA relevance	Protects the login the cookie came from	No interactive login happens, so MFA never applies
DBSC device binding	Works; the cookie is bound to a TPM-held key in the browser	Does not fit headless agents on ephemeral hosts
Anomaly detection	An odd location or hour stands out	Constant datacenter traffic hides the hijack
Blast radius	One user account	Every tool the agent was wired to

Frequently Asked Questions

Can MFA protect an AI agent from session token theft?

Not in any meaningful way. Multi-factor authentication guards an interactive login performed by a human who can tap a phone or a hardware key. An AI agent performs no interactive login: it reads a token from a secret store or environment variable and presents it on every call. The factor was satisfied once, by whoever provisioned the credential, and is never checked again. Stealing the token skips the login entirely, so adding MFA hardens a door the agent does not use. For a non-human identity the controls that matter are short token lifetime, narrow scope, and detecting abnormal use, not a second factor.

How are AI agent tokens actually stolen?

Usually without touching the model at all. An infostealer on a developer laptop or build server reads tokens from disk or memory, including the thousands of secrets that end up in Model Context Protocol config files. Transport attacks work too: in 2026 Mitiga showed Claude Code OAuth tokens could be intercepted by redirecting its MCP traffic in a man-in-the-middle attack. Poisoned dependencies are a third route, as with the March 2026 LiteLLM backdoor, in which two malicious releases exfiltrated AWS, GCP, SSH, and Kubernetes credentials from every machine that installed them.

Does Device Bound Session Credentials (DBSC) protect AI agents?

Barely. DBSC, which Google moved to general availability in Chrome 146 on Windows in 2026, ties a browser session to a non-exportable key in the device TPM so a stolen cookie cannot be replayed elsewhere. It is excellent for human browsing but assumes an interactive browser, a stable TPM, and a user session. Agents run server-side in containers across ephemeral hosts and authenticate machine-to-machine, where the device is a fleet rather than a laptop, so the protection does not transfer to where agent risk concentrates.

How do you secure an AI agent's session token?

Treat the agent as a privileged service account. Issue short-lived tokens from a broker at runtime instead of static keys, so a stolen token expires before it is useful. Scope each agent to one identity and the minimum permissions for its task. Apply default-deny egress so a compromised agent cannot exfiltrate the token or data. Baseline how the token is normally used and alert on deviations such as a new network, an odd hour, or an unfamiliar tool. Keep secrets out of files and pull them from a vault at use time.
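
The baselining step is the least familiar of these, so here is a minimal Python sketch of it. It assumes you already log, for each use of the token, the caller's ASN, the hour, and the tool called; the `TokenUse` and `UsageBaseline` types are hypothetical, and the ASNs come from the range reserved for documentation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class TokenUse:
    asn: int   # autonomous system number of the caller's network
    hour: int  # hour of day, 0-23, in UTC
    tool: str  # which tool or API the token was presented to


@dataclass
class UsageBaseline:
    asns: Set[int] = field(default_factory=set)
    hours: Set[int] = field(default_factory=set)
    tools: Set[str] = field(default_factory=set)

    def learn(self, use: TokenUse) -> None:
        """Record one known-good use of the token."""
        self.asns.add(use.asn)
        self.hours.add(use.hour)
        self.tools.add(use.tool)

    def deviations(self, use: TokenUse) -> List[str]:
        """List every dimension on which this use falls outside the baseline."""
        alerts = []
        if use.asn not in self.asns:
            alerts.append("new ASN")
        if use.hour not in self.hours:
            alerts.append("odd hour")
        if use.tool not in self.tools:
            alerts.append("unfamiliar tool")
        return alerts


baseline = UsageBaseline()
for hour in range(8, 18):  # a scheduled agent that runs during the working day
    baseline.learn(TokenUse(asn=64500, hour=hour, tool="mail.read"))

print(baseline.deviations(TokenUse(asn=64500, hour=9, tool="mail.read")))
print(baseline.deviations(TokenUse(asn=64501, hour=3, tool="repo.clone")))
```

The first use matches the baseline and produces no alerts. The second trips all three checks even though the token itself is perfectly valid, which is the only place a replayed session shows up.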