Every security team understands least privilege for humans. Role-based access control. Need-to-know. Time-bound credentials. We've been doing it for decades. For AI agents, we've barely started.

The default posture for most AI agents today is implicit trust. The agent gets an API key with broad access. It can call any tool its framework exposes. It can access any data its host process can reach. It runs with the combined permissions of the developer who wrote it, the service account that hosts it, and the model provider it connects to.

This is the opposite of zero trust. And it creates a blast radius that most security teams have not calculated.

What agents have access to today

I want to make the current state concrete, because the abstraction hides the risk.

A typical AI agent built with LangChain, CrewAI, or AutoGen has access to whatever tools the developer gave it. In practice, this often means:

- Unrestricted read-write access to production databases
- Unrestricted ability to send email
- Unrestricted outbound network access
- The ability to execute arbitrary code or shell commands

If a human employee had this level of access, we would flag it immediately. No single human role in a well-governed organization has unrestricted read-write database access, unrestricted email, unrestricted network access, and the ability to execute arbitrary commands. The principle of least privilege exists precisely to prevent this.

But we give it to AI agents routinely. Partly because the frameworks make it easy. Partly because the agent needs some of these capabilities to be useful. Partly because nobody has built the access control layer yet.

What least privilege means for agents

The principle is the same as for humans: grant the minimum access required to perform the task, for the minimum duration needed. The application is different because agents operate differently.

Tool scoping

An agent should only have access to the tools it needs. A claims processing agent needs access to the claims database and the policy lookup service. It does not need access to the HR system, the financial ledger, or the email service.

This sounds obvious. In practice, most agent frameworks expose tools as a flat list. The developer adds every tool the agent might ever need, and the model decides at runtime which ones to call. There is no concept of "this agent is only authorized to use these three tools." The model sees all tools and can call any of them.

Least privilege requires a layer that enforces tool authorization independently of the model's decisions. The model can request any tool call it wants. The authorization layer checks whether this specific agent is allowed to make this specific call, and rejects it if not.
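A minimal sketch of such an authorization layer, in Python. The agent IDs and tool names here are hypothetical, and a real deployment would load the allowlist from a policy store rather than a hardcoded dict; the point is only the shape of the check: default deny, enforced outside the model.

```python
# Per-agent tool allowlists. Illustrative names; a real system would
# load these from policy, not hardcode them.
ALLOWED_TOOLS = {
    "claims-agent": frozenset({"query_claims_db", "lookup_policy"}),
}

class ToolNotAuthorized(Exception):
    """Raised when an agent requests a tool outside its allowlist."""

def authorize_tool_call(agent_id: str, tool_name: str) -> None:
    # Default deny: an agent with no policy entry gets no tools at all.
    allowed = ALLOWED_TOOLS.get(agent_id, frozenset())
    if tool_name not in allowed:
        raise ToolNotAuthorized(f"{agent_id} is not authorized to call {tool_name}")
```

The model remains free to request any tool call; this layer sits between the request and the execution, and the rejection happens regardless of what the model decided.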

Data scoping

An agent that processes insurance claims needs access to the claimant's data. It does not need access to all claimants' data. It does not need access to employee data. It certainly does not need access to the claims of other insurers if the system is multi-tenant.

Data scoping for agents is harder than for humans because agents don't have natural session boundaries the way humans do. A human logs in, works on specific records, and logs out. An agent processes requests continuously. Each request might involve different data scopes.

The solution is to scope data access per session, not per agent. Each session has a defined context: which records, which customer, which case. The agent can access data within that context. Access to data outside the context is denied. When the session ends, the access expires.
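A sketch of a session-scoped data check, under the assumption that each session is issued a context listing the records it may touch and an expiry time. Field names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SessionScope:
    """Data context for one agent session: one claimant, a fixed record set."""
    claimant_id: str
    record_ids: frozenset
    expires_at: float  # epoch seconds; access dies with the session

def can_read(scope: SessionScope, record_id: str, now: float) -> bool:
    # Deny anything outside the session's context, and everything after expiry.
    return now < scope.expires_at and record_id in scope.record_ids
```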

Model scoping

Not every agent should use every model. A customer-facing chatbot might be restricted to a specific model version that has been evaluated for the use case. An internal analytics agent might be allowed to use a wider range of models because the risk profile is different.

This also prevents model substitution attacks, where an agent is tricked into routing requests to a less restricted model. If the agent is authorized to call GPT-4 and nothing else, a prompt injection that instructs it to "use the more capable model" will fail at the policy layer.
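The policy-layer check can be sketched the same way: resolve the model through a per-agent allowlist rather than trusting the agent's request. The agent names and model identifiers below are hypothetical.

```python
class ModelNotAuthorized(Exception):
    """Raised when an agent requests a model outside its policy."""

# Hypothetical policy table: customer-facing agents pinned to an
# evaluated model, internal agents allowed a wider range.
AGENT_MODEL_POLICY = {
    "support-chatbot": frozenset({"gpt-4"}),
    "internal-analytics": frozenset({"gpt-4", "gpt-4o"}),
}

def resolve_model(agent_id: str, requested: str) -> str:
    allowed = AGENT_MODEL_POLICY.get(agent_id, frozenset())
    if requested not in allowed:
        raise ModelNotAuthorized(f"{agent_id} is not authorized to use {requested}")
    return requested
```

An injected instruction to "use the more capable model" produces a request for an unlisted model, which fails here before it ever reaches a provider.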

Budget scoping

Every agent should have a budget. Not a suggested budget. A hard limit. Enforced per session and per time window.

The budget prevents runaway agents from consuming unlimited resources. But it also serves as a security control. Many prompt injection attacks work by inducing the agent to perform repetitive operations. A session budget caps the damage. If the agent is limited to 1,000 tokens per session and an attack tries to consume 100,000, the session terminates after 1,000.
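A hard session budget can be as simple as a counter checked before every model call. This is a sketch, not a complete accounting system; a real implementation would also enforce per-time-window limits across sessions.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push the session past its hard limit."""

class SessionBudget:
    """Hard per-session token limit. charge() runs before each model call."""

    def __init__(self, limit_tokens: int):
        self.limit = limit_tokens
        self.used = 0

    def charge(self, tokens: int) -> None:
        # Reject the call outright rather than partially spending over the cap.
        if self.used + tokens > self.limit:
            raise BudgetExceeded(f"{self.used + tokens} tokens exceeds limit {self.limit}")
        self.used += tokens
```

Under a repetitive-operation attack, the exception fires on the first call that would cross the limit and the session terminates there.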

Temporal scoping

Some agents should only operate during certain hours. A reporting agent that generates daily summaries should not be active at midnight. A customer service agent should not process requests outside business hours if there's no human available to oversee it.

Temporal scoping is the least commonly implemented control but one of the most practical. Many incidents happen outside business hours precisely because there's less oversight. Time-based restrictions reduce the window of unsupervised operation.
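The check itself is trivial, which is part of the argument that temporal scoping is underused. A sketch, assuming a simple same-day business-hours window (the 09:00 to 17:00 default is illustrative; overnight windows would need slightly different logic):

```python
from datetime import datetime, time

def in_operating_window(now: datetime,
                        start: time = time(9, 0),
                        end: time = time(17, 0)) -> bool:
    """Allow agent activity only inside the configured daily window."""
    return start <= now.time() < end
```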

The identity problem

None of this works without agent identity. You can't enforce per-agent policies if you can't tell agents apart.

Most agents authenticate with shared API keys. Every agent on the team uses the same OpenAI key. Every agent on the platform uses the same service account. From the governance layer's perspective, they're all the same entity.

Zero trust requires individual identity. Each agent needs its own credential. Not just for authentication but for authorization. The credential encodes who the agent is, which allows the policy engine to determine what the agent is allowed to do.

In traditional infrastructure, this is solved by SPIFFE: the Secure Production Identity Framework for Everyone. Each workload gets a cryptographic identity (an X.509 certificate or a JWT) that attests to what it is, not who created it. The identity is short-lived and automatically rotated. It can be revoked instantly.

The same model applies to AI agents. Each agent gets a cryptographic identity. Each identity maps to a set of permissions: which tools, which data, which models, which budgets. When the agent makes a request, its identity is verified and its permissions are checked. No identity, no access. Wrong identity for the requested action, no access.
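Putting the pieces together, the identity becomes the key into the permission set. In the sketch below the SPIFFE-style URI stands in for a verified identity; in a real deployment it would come from a validated X.509 SVID or JWT, not a plain string, and the registry would live in a policy engine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grant:
    """What one agent identity is allowed: tools, models, budget."""
    tools: frozenset
    models: frozenset
    token_budget: int

# Hypothetical registry keyed by workload identity.
REGISTRY = {
    "spiffe://example.org/agents/claims-processor": Grant(
        tools=frozenset({"query_claims_db", "lookup_policy"}),
        models=frozenset({"gpt-4"}),
        token_budget=1000,
    ),
}

def authorize(identity: str, tool: str, model: str) -> Grant:
    grant = REGISTRY.get(identity)
    if grant is None:
        raise PermissionError("no identity, no access")
    if tool not in grant.tools or model not in grant.models:
        raise PermissionError("identity not authorized for this action")
    return grant
```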

This is not a new idea. It's just not being applied to this new category of workload yet.

The practical challenge

I want to be honest about why this isn't happening faster. It's not that security teams don't understand the principle. It's that the tooling makes it hard.

Agent frameworks are built for developer productivity, not security governance. Adding a tool to an agent is one line of code. Restricting which tools an agent can use requires building custom middleware that most teams don't have time to build.

Model provider APIs authenticate with API keys, not with workload identities. There's no standard way to say "this API key belongs to this specific agent and should have these specific limits." The key is the key.

The governance layer that sits between the agent and the model and enforces tool authorization, data scoping, budget limits, and temporal restrictions doesn't exist in most agent frameworks. It needs to be added as external infrastructure. That's work. It requires planning, investment, and organizational commitment.

But the alternative is worse. The alternative is agents with implicit trust, broad access, and no blast radius containment. In a world where those agents are processing regulated data and making consequential decisions, the alternative is unacceptable.


Zero trust for AI agents is not a new security principle. It's an existing security principle that hasn't been applied yet. The gap is tooling and habit, not theory. Every CISO already knows that broad access is bad. The task is to make least privilege practical for a new category of system that is being deployed faster than the security controls can keep up.

Scope your agents

TapPass enforces per-agent tool scoping, data classification, budget limits, and temporal controls. Policy-driven, not developer-driven.

Book a demo