AI Agents Need Identity

David Goldschlag · Aug 2, 2024

When your team deploys a LangChain agent that queries an internal API for customer data, the agent makes an HTTP request. That request has an Authorization header. What's in it? In most early AI deployments we've looked at, the answer is: a static API key that was pasted into a .env file or a Kubernetes Secret, shared across every instance of that agent, with no expiry and no audit trail attributing specific agent actions to specific credentials.

AI agents have a workload identity problem that's structurally identical to the service account problem — except the blast radius is larger, because agents are designed to make decisions autonomously and call multiple downstream systems in a single execution. A compromised shared API key that an agent uses doesn't just expose one API call; it exposes the agent's full access scope for an indefinite period.

What Makes AI Agents Different From Regular Services

A traditional microservice has a stable, predictable call graph. Service A calls services B, C, and D. The set of credentials it needs is known at deployment time, and you can scope those credentials precisely because you know what each service does.

An AI agent's call graph is determined at runtime by the model. A customer support agent might call your CRM to look up account history, then call a pricing API to check current rates, then call a ticketing system to create a case — all in a single execution, based on the content of the user's question. The agent's tool list defines the potential calls; the model determines which ones actually happen.

This creates a policy question that doesn't exist for traditional services: when should this agent be allowed to call which tools, with what scope, for which user's data? The static "this service account can read CRM records" authorization model doesn't capture the contextual authorization an agent needs: authorization that should depend on which user the agent is acting for, what action is being taken, and whether that action is within the expected scope for this type of request.

The Shared API Key Pattern and Its Problems

The common deployment pattern for agents with tool access today:

CUSTOMER_DATA_API_KEY=sk_prod_abc123...
TICKETING_API_KEY=tk_live_xyz789...
PRICING_API_KEY=pk_prod_def456...

These get loaded into the agent process environment or pulled from a secrets manager at startup. All agent instances share the same keys. Every call the agent makes to the customer data API uses the same credential, regardless of which user the agent is serving, which workflow triggered it, or whether the access was appropriate.

The problems with this pattern:

No per-agent-instance attribution. When the customer data API logs requests authenticated with sk_prod_abc123, it sees one credential making many requests. If an agent instance falls victim to a prompt injection attack and exfiltrates data, the audit log shows the credential, not which agent invocation was compromised. Reconstructing after the fact which invocation leaked which data becomes extremely difficult.

Scope drift. Agents gain access to tools incrementally as developers add capabilities. Each new tool adds credentials to the shared environment. The agent that started as "looks up account info" now also has write access to the ticketing system, read access to pricing data, and a webhook credential for a notification service. Nobody went back to scope the original credentials more narrowly when the agent's responsibilities changed.

Cross-user contamination risk. If you're running a shared agent that serves multiple users, the same credential is used for all of them. A bug in the agent's data scoping logic — or a successful prompt injection — can cause the agent to access one user's data using a credential that was used moments before for a different user. There's no credential-level signal of which user's data access was appropriate.

What Agent Identity Should Look Like

The mental model we find useful: an AI agent is a workload that should have its own attested identity, the same as any other service in your stack. The difference from traditional services is that agents often act on behalf of users, which means agent identity needs to carry both workload identity (what is this agent?) and delegation context (whose request is this agent executing?).

For workload identity, the same mechanisms that work for regular services work for agents: SPIFFE SVIDs if you're running agents in Kubernetes, OIDC-based identity for agents running in cloud functions, or process attestation for agents running in VMs. The agent runtime (the Python process running LangChain, for example) can be attested the same way any other process can be.
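As a rough illustration of what "is this an authorized agent deployment?" can mean once an SVID has been verified, the sketch below parses a SPIFFE ID and checks it against an expected trust domain and path prefix. The trust domain `prod.example.org`, the `/ns/agents/sa/` prefix, and both function names are assumptions made for this example. The code only inspects the ID string; cryptographic verification of the SVID itself is assumed to have happened already.

```python
from urllib.parse import urlsplit

# Hypothetical trust domain and path prefix, chosen for illustration only.
TRUSTED_DOMAIN = "prod.example.org"
AGENT_PATH_PREFIX = "/ns/agents/sa/"


def parse_spiffe_id(spiffe_id: str) -> tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path), rejecting malformed IDs."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    # SPIFFE IDs carry no query, fragment, userinfo, or port.
    if parts.query or parts.fragment or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError(f"malformed SPIFFE ID: {spiffe_id!r}")
    return parts.netloc, parts.path


def is_trusted_agent(spiffe_id: str) -> bool:
    """Return True if the ID is in the trusted domain under the agent path."""
    try:
        domain, path = parse_spiffe_id(spiffe_id)
    except ValueError:
        return False
    return domain == TRUSTED_DOMAIN and path.startswith(AGENT_PATH_PREFIX)
```

With these assumed values, `is_trusted_agent("spiffe://prod.example.org/ns/agents/sa/support-agent")` is accepted, while an ID from another trust domain or a non-SPIFFE URI is rejected.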

For delegation context, the natural carrier is the OIDC token that your authentication layer issues to the human user making the request. When a user authenticates to your application and triggers an agent workflow, the user's identity token can be forwarded to the agent, which can include it in downstream API calls as evidence that it's acting on behalf of a specific user. This is the delegation pattern behind "on behalf of" (OBO) flows, which OAuth 2.0 Token Exchange (RFC 8693) standardizes: the agent presents the user's token, optionally alongside its own, and receives a new token that identifies the user and, in the delegation case, the agent acting for them.
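To make the delegation flow concrete, here is a minimal sketch of the form-encoded body an agent might send to a token endpoint for an RFC 8693 exchange. The parameter names and URN values come from the RFC. The function name, the choice to treat the user's token as an access token and the agent's token as a JWT, and the example audience and scope are assumptions; an actual deployment would follow whatever its authorization server expects.

```python
from urllib.parse import urlencode, parse_qs

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def build_token_exchange_body(user_token: str, agent_token: str,
                              audience: str, scope: str) -> str:
    """Build the form-encoded body of an RFC 8693 token exchange request."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": user_token,        # the user the agent acts for
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "actor_token": agent_token,         # the agent's own workload identity
        "actor_token_type": JWT_TOKEN_TYPE,
        "audience": audience,               # downstream service the token is for
        "scope": scope,                     # requested permissions
    })


# Hypothetical values for illustration.
body = build_token_exchange_body("user-token", "agent-token",
                                 "customer_data_api", "crm:read")
```

The body would be sent as `application/x-www-form-urlencoded` to the authorization server's token endpoint; the token returned is the one the agent attaches to its downstream call.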

Combined, a properly structured agent call to a downstream API carries: the agent's own workload identity (proving this is an authorized agent deployment, not arbitrary code), the user's identity (proving who the agent is acting for), and the requested scope (which data, which action). The downstream API can make meaningful authorization decisions based on all three.
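A downstream API's decision over those three inputs could look something like the following sketch. Everything here is hypothetical: the `AgentRequest` fields, the `AGENT_SCOPES` table, and the rule that a user may only reach their own data are stand-ins for whatever policy a given service actually enforces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRequest:
    agent_id: str        # workload identity of the calling agent
    user_id: str         # user the agent is acting for
    scope: str           # requested operation, e.g. "crm:read"
    resource_owner: str  # user whose data is being requested


# Hypothetical policy: which scopes each agent deployment may use.
AGENT_SCOPES = {
    "support-agent": {"crm:read", "tickets:write"},
}


def authorize(req: AgentRequest) -> tuple[bool, str]:
    """Decide on all three signals: agent identity, scope, and user."""
    allowed = AGENT_SCOPES.get(req.agent_id)
    if allowed is None:
        return False, "unknown agent"
    if req.scope not in allowed:
        return False, "scope not permitted for agent"
    if req.user_id != req.resource_owner:
        return False, "user may not access another user's data"
    return True, "ok"
```

The point of the sketch is that a request can fail on any one of the three signals independently, which a single shared API key cannot express.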

Implementing Per-Agent Credential Isolation

A practical intermediate step, before full workload identity integration, is per-agent-instance credential isolation. Rather than all agent instances sharing one API key per tool, each agent instance gets a unique, short-lived credential that's scoped to that instance's execution context.

For tools that support OAuth 2.0 client credentials or API key issuance via a management API, you can implement this as a hook that runs at the start of each agent execution:

import httpx

async def get_agent_credentials(
    agent_instance_id: str, user_context: dict, agent_workload_token: str
) -> dict:
    """Request per-execution credentials from the credential service."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://credentials.internal/issue",
            json={
                "agent_id": agent_instance_id,
                "user_id": user_context["user_id"],
                "tools": ["customer_data_api", "ticketing_api"],
                "ttl_seconds": 300,  # 5 minutes, covering expected agent execution time
            },
            # The agent's workload token proves its identity to the credential service
            headers={"Authorization": f"Bearer {agent_workload_token}"},
        )
        response.raise_for_status()  # surface denied or failed issuance as an error
        return response.json()["credentials"]

The credentials returned are unique to this agent invocation, expire after 5 minutes (long enough for the execution, short enough to limit breach exposure), and carry the user_id as context that downstream services can log and audit. The credential service receives the agent's workload token (agent_workload_token) as proof of identity before issuing the per-execution credentials.

This isn't perfect: you're still issuing credentials to the agent, just shorter-lived and more narrowly scoped ones. But it gives you attribution in audit logs, automatic expiry that limits the breach window, and a place to inject authorization logic (the credential service can deny issuance if the user doesn't have permission for these tools).
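On the receiving side, the attribution and expiry benefits might be realized roughly as follows. This is a sketch under assumptions: the `ExecutionCredential` fields and the shape of the audit record are invented for illustration, and the step that looks the credential up from the presented token is left out.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExecutionCredential:
    token: str
    agent_id: str      # which agent invocation the credential was issued to
    user_id: str       # which user that invocation was serving
    expires_at: float  # Unix timestamp in seconds


def check_credential(cred: ExecutionCredential,
                     now: Optional[float] = None) -> dict:
    """Return an audit record stating whether the credential is still valid."""
    now = time.time() if now is None else now
    return {
        "agent_id": cred.agent_id,
        "user_id": cred.user_id,
        "allowed": now < cred.expires_at,  # expired credentials are refused
        "checked_at": now,
    }
```

Each record names the agent invocation and the user, so a log query can answer "which invocation made this call?", something the shared-key audit log cannot do.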

Policy Enforcement at the Credential Layer

The credential issuance step is where meaningful policy enforcement happens. Questions you can answer at issuance time:

Is this user tier allowed to trigger this agent workflow? (e.g., some agents are restricted to certain subscription plans)
Has this user hit a rate limit for this tool category today?
Is the agent requesting access to a tool that's outside its declared tool list for this workflow?
Is the requested TTL within the expected bounds for this type of execution?

None of these questions can be answered at deployment time when you're using shared static API keys. They require a per-execution credential issuance step that has access to context about the current request.
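The four questions above can be sketched as checks inside a credential service. All names and limits below (`WORKFLOW_TOOLS`, `WORKFLOW_PLANS`, the 600-second TTL cap, the 100-call daily limit) are hypothetical placeholders; a real service would presumably load them from a policy store rather than hard-code them.

```python
from dataclasses import dataclass

# Hypothetical policy data for illustration only.
WORKFLOW_TOOLS = {"support": {"customer_data_api", "ticketing_api"}}
WORKFLOW_PLANS = {"support": {"pro", "enterprise"}}
MAX_TTL_SECONDS = 600
DAILY_CALL_LIMIT = 100


@dataclass(frozen=True)
class IssuanceRequest:
    workflow: str
    user_plan: str
    calls_today: int
    tools: frozenset
    ttl_seconds: int


def evaluate(req: IssuanceRequest) -> list[str]:
    """Return the policy violations; an empty list means the credential may be issued."""
    violations = []
    if req.user_plan not in WORKFLOW_PLANS.get(req.workflow, set()):
        violations.append("plan not allowed for workflow")
    if req.calls_today >= DAILY_CALL_LIMIT:
        violations.append("daily rate limit reached")
    if not req.tools <= WORKFLOW_TOOLS.get(req.workflow, set()):
        violations.append("tool outside declared tool list")
    if not 0 < req.ttl_seconds <= MAX_TTL_SECONDS:
        violations.append("ttl out of bounds")
    return violations
```

Returning every violation rather than stopping at the first is a design choice made here so that the denial can be logged with its full reasons.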

This is the model we're building at Aembit: workload identity (the agent's attested identity) combined with contextual authorization policy that's evaluated per access request, not just per deployment. When an agent requests a credential for the customer data API, Aembit checks: is this agent's SPIFFE ID in the policy for this API? Is the user the agent is acting for allowed to access this endpoint? Is the requested operation within the allowed scope? If all conditions pass, Aembit issues a short-lived credential and logs the authorization decision with full attribution.

The Agentic Authorization Gap

The spread of AI agent deployments is accelerating the non-human identity problem in ways that practitioners focused on human IAM haven't fully grappled with. The number of non-human identities in a system that includes autonomous agents isn't a list of 40+ service accounts; it's the product of agent types, user contexts, and tool combinations. An agent that can call 10 different APIs on behalf of users is potentially generating 10 or more unique credential needs per user interaction.
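As a back-of-the-envelope illustration of that product, the sketch below multiplies the three factors. The function name and the figures (5 agent types, 1,000 active users, 10 tools per agent) are made up for the example and are not measurements from any deployment.

```python
def credential_contexts(agent_types: int, active_users: int, tools_per_agent: int) -> int:
    """Upper bound on distinct (agent type, user, tool) credential contexts."""
    return agent_types * active_users * tools_per_agent


# Hypothetical deployment: 5 agent types, 1,000 active users, 10 tools each.
print(credential_contexts(5, 1_000, 10))  # prints 50000
```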

We're not saying every team needs to implement full SPIFFE-based agent identity before deploying agents. We're saying that deploying agents with shared, non-expiring API keys and then building an "identity for agents" story later is the same decision path that led to the current service account sprawl problem — and the right time to correct the architecture is at the beginning, not after the deployment is in production and the credentials have proliferated.

The core question to ask for every tool in your agent's toolset: if this agent instance were compromised, what's the blast radius? If the answer is "everything this credential can access, indefinitely" — that's the architecture to fix.