
Why AI Agents Fail Security Audits

David Goldschlag · Dec 9, 2025

SOC 2 and ISO 27001 auditors are adding AI agent questions to their standard evidence requests. The question usually comes in via an RFI: "Describe the authentication and access control mechanisms for any AI or automated agents deployed in your environment." Most teams respond with something about their LLM provider's API key management, which misses the point entirely. The auditor is asking about what the agents do after they get their instructions — what systems they access, with what credentials, and whether there's a way to trace access events back to specific agent invocations.

These audits reveal the same three failure modes consistently: shared credentials used by multiple agent types, no per-agent audit trail, and no defined scope for what an agent is authorized to access. This post breaks down why each failure matters from an audit perspective and what the passing answer looks like.

Failure Mode 1: Shared Credentials Across Agent Types

The most common pattern: a single API key or service account that all AI agents use for accessing downstream systems. The orchestrator, the retrieval agent, the tool-execution agent — they all authenticate as the same identity when calling external services or internal APIs.

From an audit perspective, this fails on two grounds.

First, it violates the principle of least privilege. If the orchestrator needs read access to documents and the execution agent needs write access to a workflow system, a shared credential must hold both sets of permissions. When auditors ask "what is the minimum necessary access for each component of your AI system?", the answer "one credential covers everything" is not a control — it's an absence of control.

Second, it eliminates attribution. When auditors ask for evidence that you can detect and investigate unauthorized access, you need to show that access logs are actionable. Logs showing "the shared service account accessed the customer database" don't tell you which agent invocation triggered it, which user request drove that invocation, or whether the access was expected behavior or anomalous. In an incident scenario, that's the difference between a 2-hour investigation and a 2-week one.

The passing answer: per-agent-type workload identities using SPIFFE-based attestation. Each agent type has its own SPIFFE ID, its own access policy, and its own credential set. The audit trail shows which agent type accessed what, and access policies explicitly define what each agent type is authorized to do.
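
As a rough illustration of the difference between per-agent-type identities and one shared credential, the sketch below maps each agent type to its own SPIFFE ID and permission set. The trust domain `example.org`, the `/agent/<type>` path layout, and the permission pairs are all assumptions made for this example, not a prescribed naming scheme.

```python
from urllib.parse import urlsplit

# Assumption: hypothetical trust domain and path layout,
# spiffe://example.org/agent/<agent-type>.
TRUST_DOMAIN = "example.org"

# Assumption: illustrative (resource, operation) pairs per agent type.
AGENT_PERMISSIONS = {
    "orchestrator": {("documents", "read")},
    "retrieval": {("documents", "read")},
    "tool-executor": {("workflow", "write")},
}


def agent_type_from_spiffe_id(spiffe_id: str) -> str:
    """Return the agent type encoded in a SPIFFE ID of the assumed layout."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or parts.netloc != TRUST_DOMAIN:
        raise ValueError(f"not a SPIFFE ID in {TRUST_DOMAIN}: {spiffe_id!r}")
    # "/agent/retrieval" splits into ["", "agent", "retrieval"].
    segments = parts.path.split("/")
    if len(segments) != 3 or segments[1] != "agent":
        raise ValueError(f"unexpected path layout: {spiffe_id!r}")
    if segments[2] not in AGENT_PERMISSIONS:
        raise ValueError(f"unknown agent type: {segments[2]!r}")
    return segments[2]


def permissions_for(spiffe_id: str) -> set:
    """Permissions granted to one agent type, looked up by its identity."""
    return AGENT_PERMISSIONS[agent_type_from_spiffe_id(spiffe_id)]


def shared_credential_permissions() -> set:
    """What a single shared credential would need: the union of everything."""
    return set().union(*AGENT_PERMISSIONS.values())
```

With this layout, `permissions_for("spiffe://example.org/agent/orchestrator")` contains only document reads, while `shared_credential_permissions()` also contains the workflow write that only the execution agent needs. That gap is what a least-privilege finding points at.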

Failure Mode 2: No Verifiable Audit Trail

Teams often have logging — the LLM API calls are logged, the tool invocations are logged, the application database queries are logged. But "logging exists" is different from "an auditable trail exists."

An auditable trail needs to answer: for this specific access event, which identity made it, what authorized it, and can I correlate it to the originating request? Most AI agent logging falls short on the second and third questions.

What authorized it: the access log should include the policy or access control decision that permitted the action. For workload identity systems, this is the policy evaluation record — which access policy matched, which conditions were satisfied, and what credential was issued as a result. Without this, an auditor asking "how do you know this access was authorized?" gets "it happened, so it must have been" — which is not a satisfying answer.

Correlation to originating request: a pipeline that runs autonomously based on an upstream trigger needs a correlation ID that connects every access event in that pipeline run to the triggering event. If the pipeline run ID isn't in the access log, you can't reliably reconstruct which customer request caused which database query, which stalls forensic investigation and weakens any compliance demonstration.
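
One way to carry a pipeline run ID through every access event is a context variable that is set once when the run starts. The sketch below uses Python's standard `contextvars` module; the function names, the run ID format, and the in-memory `sink` list are hypothetical stand-ins for whatever your pipeline and log store actually use.

```python
import contextvars
import uuid

# Holds the current pipeline run ID; None means "not inside a run".
_pipeline_run_id = contextvars.ContextVar("pipeline_run_id", default=None)


def start_pipeline_run(trigger_event_id: str) -> str:
    """Create a run ID tied to the triggering event and make it current."""
    # Assumption: run IDs are "<trigger id>:<random hex>"; any unique scheme works.
    run_id = f"{trigger_event_id}:{uuid.uuid4().hex}"
    _pipeline_run_id.set(run_id)
    return run_id


def log_access(resource: str, operation: str, sink: list) -> None:
    """Append an access event that always carries the current run ID."""
    run_id = _pipeline_run_id.get()
    if run_id is None:
        # Refuse to log (and, by extension, to access) without a run ID.
        raise RuntimeError("access attempted outside a pipeline run")
    sink.append(
        {"pipeline_run_id": run_id, "resource": resource, "operation": operation}
    )
```

Raising when no run ID is set is a deliberate choice in this sketch: an uncorrelated access event is treated as a bug to surface during development, not something to discover during an investigation.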

The passing answer: structured audit logs per access event that include the agent's SPIFFE ID, the policy ID that authorized access, the pipeline run ID for correlation, the resource accessed, and the operation performed. These logs should be immutable and retained for at least the period specified in your retention policy. SOC 2 does not prescribe a fixed retention period; many teams settle on 12-24 months so that logs comfortably cover the audit window.
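
A minimal sketch of such a record, assuming JSON lines as the log format, might look like the following. The field names and the `make_event` and `to_log_line` helpers are illustrative choices, not a required schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AccessEvent:
    spiffe_id: str        # which identity made the access
    policy_id: str        # which policy authorized it
    pipeline_run_id: str  # correlation to the originating request
    resource: str         # what was accessed
    operation: str        # what was done to it
    timestamp: str        # ISO 8601, UTC


def make_event(spiffe_id: str, policy_id: str, pipeline_run_id: str,
               resource: str, operation: str,
               now: Optional[datetime] = None) -> AccessEvent:
    """Build an event, stamping it with the current UTC time by default."""
    now = now or datetime.now(timezone.utc)
    return AccessEvent(spiffe_id, policy_id, pipeline_run_id,
                       resource, operation, now.isoformat())


def to_log_line(event: AccessEvent) -> str:
    """Serialize one event as a single JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

The frozen dataclass only prevents accidental mutation in process. Immutability of the stored logs still has to come from the log store itself, for example write-once storage or append-only permissions.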

Failure Mode 3: Undefined or Over-Broad Access Scope

When auditors ask "what can this AI agent access?", the answer "anything the system needs" is a finding. Access scope needs to be defined before deployment, not inferred from what the agent happens to do in production.

This failure mode often stems from how AI agent systems are built: developers grant the agent enough access to complete a task, discover mid-development that it needs more, add permissions incrementally, and never go back to right-size the permissions when the development phase is complete. The result is a production agent with permissions that reflect the maximum it ever needed during testing, not the minimum it needs in production.

For auditors, the question is specifically whether you can demonstrate that access scope is bounded and documented. A service account with read/write to your entire database cluster is not a bounded scope — even if the agent only reads in practice. The access control system should enforce the boundary; "we trust the agent not to exceed its scope" is not a control.

The passing answer: access policies with explicit server workload definitions. In Aembit's model, every agent type has a defined set of server workloads it can request credentials for, and each server workload definition scopes the credential to the minimum necessary access. An auditor can look at the policy configuration and read off exactly what each agent type can and cannot access. Any attempt to access something outside the defined server workloads is denied at the policy layer; it is not merely absent from the agent's normal behavior.
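
The deny-by-default behavior described above can be sketched generically as follows. This is not Aembit's API or configuration format; `AccessPolicy`, `Decision`, and `authorize` are hypothetical names used only to show the shape of the check and of the decision record an auditor would want to see.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class AccessPolicy:
    policy_id: str
    agent_spiffe_id: str        # the one agent type this policy applies to
    server_workload: str        # the one downstream system it covers
    operations: FrozenSet[str]  # operations allowed on that system


@dataclass(frozen=True)
class Decision:
    allowed: bool
    policy_id: Optional[str]    # which policy matched, if any
    reason: str


def authorize(policies: Iterable[AccessPolicy], spiffe_id: str,
              server_workload: str, operation: str) -> Decision:
    """Allow only when an explicit policy matches; otherwise deny."""
    for policy in policies:
        if (policy.agent_spiffe_id == spiffe_id
                and policy.server_workload == server_workload
                and operation in policy.operations):
            return Decision(True, policy.policy_id, "matched policy")
    # No match means no access, regardless of what the agent usually does.
    return Decision(False, None, "no matching policy")
```

Because the decision carries the matching policy ID, it can be written straight into the audit log to answer "what authorized this access?" for each event.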

What Auditors Actually Look For

The evidence request for AI agent access control typically asks for:

1. Inventory of AI agents in production: What agents are deployed, what they do, which systems they access
2. Authentication mechanism documentation: How agents authenticate to downstream systems; whether credentials are per-agent or shared
3. Access policy documentation: The defined scope of what each agent can access, in a machine-readable or auditable form
4. Sample audit logs: Evidence that access events are logged with sufficient detail to investigate anomalies
5. Incident response procedure: What happens if an agent accesses something it shouldn't, or if a credential used by an agent is suspected compromised

Items 1-3 are often missing entirely. Item 4 exists in some form but lacks the correlation and attribution detail needed for it to be useful. Item 5 is frequently the IAM key revocation procedure copy-pasted into an AI agent section without adjustment for the actual agent architecture.

Preparing the Evidence Package

If your audit is approaching and your AI agents don't currently have per-agent identity, here's what you can produce in the short term while building toward the full solution:

Agent inventory document: A table of deployed agents with name, function, downstream systems accessed, credential type used, and owner team. Even if all agents share a credential today, having this inventory documented demonstrates awareness and control of the environment.
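
If the inventory is kept as structured data instead of a free-form document, the shared-credential question can be answered mechanically. The sketch below assumes a simple list of records with the columns named above; the agent names, credential names, and owner teams are made up for illustration.

```python
from collections import defaultdict

# Assumption: hypothetical inventory rows; replace with your real agents.
INVENTORY = [
    {"name": "orchestrator", "function": "routes user requests",
     "systems": ["documents-api"], "credential": "svc-ai-shared",
     "owner": "platform"},
    {"name": "retrieval", "function": "fetches documents",
     "systems": ["documents-api"], "credential": "svc-ai-shared",
     "owner": "search"},
    {"name": "tool-executor", "function": "runs workflow actions",
     "systems": ["workflow-api"], "credential": "svc-tool-exec",
     "owner": "automation"},
]


def shared_credentials(inventory: list) -> dict:
    """Map each credential used by more than one agent to those agents."""
    agents_by_credential = defaultdict(list)
    for agent in inventory:
        agents_by_credential[agent["credential"]].append(agent["name"])
    return {cred: names for cred, names in agents_by_credential.items()
            if len(names) > 1}
```

Running `shared_credentials(INVENTORY)` on these rows flags `svc-ai-shared` as used by both the orchestrator and the retrieval agent, which gives you a concrete list to work down as you move to per-agent identities.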

Intended access scope per agent: For each agent type, define what it should be able to access. Even if enforcement isn't yet in place, having the definition documented shows that the scope was intentionally designed. Note it as "defined but not yet enforced by technical control, relying on code-level restriction currently, with technical enforcement planned for [date]." Auditors generally accept honest roadmaps for controls in progress.

Log sample with correlation: Pull a sample of 10-20 access events from your agent system and annotate them with the correlation to the originating request. Even if this correlation requires manual reconstruction for the sample, it demonstrates that the information exists and can be recovered — and it gives you a baseline for building automated correlation.
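
When logs carry no correlation ID yet, one heuristic for building the annotated sample is to attribute each access event to the most recent request that started shortly before it. The sketch below does that with a time window. It is an assumption-laden approximation: the five-minute window is arbitrary, and with concurrent requests the match can be wrong, so each annotation should still be reviewed by hand.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def annotate(events: list, requests: list,
             window: timedelta = timedelta(minutes=5)) -> list:
    """Attach a likely originating request ID to each access event.

    events:   list of (timestamp, description) tuples
    requests: list of (start_time, request_id) tuples
    Returns (timestamp, description, request_id or None) tuples.
    """
    ordered = sorted(requests)
    starts = [start for start, _ in ordered]
    annotated = []
    for ts, description in events:
        # Index of the latest request that started at or before this event.
        i = bisect_right(starts, ts) - 1
        if i >= 0 and ts - starts[i] <= window:
            annotated.append((ts, description, ordered[i][1]))
        else:
            # No plausible request: leave it unattributed for manual review.
            annotated.append((ts, description, None))
    return annotated
```

Events that come back with `None` are as useful as the matched ones: they show exactly where automated correlation is needed.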

Incident response addendum: Add an AI agent section to your incident response procedure that specifically covers: what to do if a shared agent credential is suspected compromised (rotation procedure, scope of potential blast radius), how to isolate a misbehaving agent (kill switch, network isolation), and who is responsible for investigating access anomalies from AI agents.

The Faster Path: Build It Right From the Start

For teams building new AI agent systems rather than retrofitting existing ones, the path to audit-readiness is easier: start with per-agent-type SPIFFE identities, define access policies before writing agent code, and build correlation IDs into the pipeline run structure from the beginning. These decisions add a few days of setup work and little ongoing overhead, because the identity layer operates transparently once configured.

We're not suggesting that every team needs to pass a SOC 2 audit tomorrow. We're saying the audit questions are good questions: who is accessing what, with what authority, and can you tell when something goes wrong? Building AI agent systems that can answer those questions clearly is good engineering regardless of whether an auditor ever asks.

The teams that fail audits on AI agent authentication aren't doing anything malicious — they built quickly, they were focused on the ML and product problems, and the identity infrastructure wasn't in scope during the development sprint. The problem is that "not in scope during development" becomes a compliance finding and a real security gap once the system is handling production workloads and sensitive data.