Envoy Sidecar Identity Injection

Maya Chen · 12 min read

Deployed as a sidecar, Envoy intercepts all inbound and outbound traffic of the application container. This is the fundamental property that makes service mesh architectures work: the proxy is transparent to the application, sits between the application and the network, and can observe and transform every connection and request that goes in or out. That same interception point is where workload identity can be injected into outbound requests without any application code changes.

The idea: if every outbound HTTP request from a pod passes through an Envoy sidecar, the sidecar can attach an authentication header derived from the pod's workload identity before the request leaves the pod network namespace. The application sends an unauthenticated request to an internal endpoint. The sidecar transparently adds an Authorization: Bearer header containing a short-lived token representing the pod's SPIFFE identity. The destination service validates the token and grants or denies access based on the workload's identity, not a credential the application was configured with.

How Envoy Sees Traffic

In a Kubernetes pod with an Envoy sidecar (whether injected by Istio, by a custom admission webhook, or by a DaemonSet), iptables rules, typically installed by an init container or a CNI plugin, redirect traffic through the sidecar. The port numbers below are Istio's defaults:

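  • Inbound TCP traffic on all ports, except a few reserved for the proxy itself (such as 15090, which Istio uses for Envoy's Prometheus stats; the Envoy admin interface listens on 15000), is redirected to Envoy's inbound listener on port 15006.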
  • Outbound TCP traffic is redirected to Envoy's outbound listener on port 15001.
  • The Envoy process, running as a different user than the application, is exempt from these iptables rules, so its own traffic goes directly to the network.

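Say the application connects to api.internal:8443. The outbound TCP connection is intercepted by iptables and redirected to Envoy's port 15001. Envoy recovers the original destination address from the SO_ORIGINAL_DST socket option, opens its own connection to api.internal:8443, and proxies the traffic with whatever transformations are configured, including adding headers, establishing mTLS, and inserting credential tokens. Header-level transformations require that Envoy can read the HTTP layer: the application sends plaintext HTTP to the sidecar and Envoy originates TLS toward the destination. If the application encrypts the connection itself, Envoy sees only opaque TLS bytes and cannot add a header.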
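
As a rough illustration of how a transparent proxy recovers the original destination, here is a minimal Python sketch. It assumes Linux and IPv4; the constant SO_ORIGINAL_DST (value 80, from linux/netfilter_ipv4.h) is not exported by Python's socket module, and the function names are made up for this example. Envoy does this in its own original-destination listener filter, not in Python.

```python
import socket
import struct

# Value from <linux/netfilter_ipv4.h>; Python's socket module does not export it.
SO_ORIGINAL_DST = 80


def parse_sockaddr_in(raw: bytes) -> tuple[str, int]:
    """Decode a 16-byte IPv4 sockaddr_in structure into (ip, port)."""
    # Layout: 2-byte address family (skipped), big-endian port,
    # 4-byte address, 8 bytes of padding (skipped).
    port, addr = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(addr), port


def original_destination(conn: socket.socket) -> tuple[str, int]:
    """Ask the kernel where an iptables-redirected connection was headed.

    Only meaningful on a socket accepted after a REDIRECT rule (Linux, IPv4).
    """
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

Calling original_destination on a connection accepted by the proxy's port-15001 listener would return the address the application actually dialed, which is what lets the proxy open its own upstream connection to the right place.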

JWT Injection via HTTP Filter

Envoy's HTTP connection manager runs each HTTP request through an ordered chain of filters before the router filter forwards it. The filters most relevant to identity injection:

lua filter — inline Lua scripts that can modify request/response headers, set custom headers based on external calls, or perform lightweight request transformation. Suitable for simple header injection from a value computed at filter time.

ext_authz filter — delegates authorization (and optionally request transformation) to an external authorization service. The filter sends request attributes to the external service, which returns an authorization decision and optionally a set of headers to add. This is the right approach for dynamic, per-request identity token injection because the external service can fetch or refresh a token, check its remaining lifetime, and return it. The request is paused while the check is in flight, but the call is asynchronous, so it does not block the Envoy worker thread or other requests.

header_to_metadata filter — extracts values from request headers into Envoy's dynamic metadata, which later filters in the chain can then read. Useful for propagating identity context through the filter chain.

A minimal Envoy configuration for outbound JWT injection using the ext_authz filter looks like this in YAML (v3 API):

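http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    transport_api_version: V3
    grpc_service:
      envoy_grpc:
        cluster_name: identity_injector  # local injector cluster, defined elsewhere
      timeout: 0.25s                     # upper bound on the per-request delay
    # with_request_body is omitted so the request body is never buffered or sent
    # to the injector; max_request_bytes must be greater than 0, so 0 is rejected.
    include_peer_certificate: true
    failure_mode_allow: false            # fail closed if the injector is unreachable
    clear_route_cache: false
# The router filter (envoy.filters.http.router) must follow as the last entry.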

The identity_injector cluster points to a local service (typically running as another container in the pod, or as the Aembit agent) that implements the gRPC ext_authz Authorization API. When a request comes through, Envoy calls this service with the request attributes. The service obtains a JWT-SVID for the request's target audience, either from its cache or from the Workload API, and returns an OK response that tells Envoy to add the Authorization: Bearer <token> header. Envoy adds the header and forwards the request.
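
The injector's per-request decision can be sketched in a few lines of Python. This is only the decision logic under stated assumptions, not a gRPC server: CheckRequest, CheckResult, check, and the fetch_jwt_svid callback are hypothetical names standing in for the ext_authz request and response messages and for the Workload API call, and the audience is simply derived from the host.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CheckRequest:
    """Hypothetical stand-in for the attributes Envoy sends to the injector."""
    host: str  # e.g. "api.internal:8443"
    path: str


@dataclass
class CheckResult:
    """Hypothetical stand-in for the injector's reply to Envoy."""
    allowed: bool
    headers_to_add: dict[str, str] = field(default_factory=dict)


def check(request: CheckRequest,
          fetch_jwt_svid: Callable[[str], str]) -> CheckResult:
    """Decide whether to forward the request and which header to inject."""
    # Assumption: the audience is the destination's base URL.
    audience = f"https://{request.host}"
    try:
        token = fetch_jwt_svid(audience)
    except Exception:
        # No token could be obtained, so deny instead of forwarding
        # an unauthenticated request.
        return CheckResult(allowed=False)
    return CheckResult(True, {"authorization": f"Bearer {token}"})
```

In a real deployment this function would sit behind the ext_authz gRPC service, and the returned headers would be carried in the OK response so that Envoy adds them to the outbound request.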

The Token Audience Problem

For the receiving service to validate the injected JWT, it needs to know which audience to expect in the token's aud claim. If the token's audience is payments.internal but the receiving service validates against api.internal, the validation fails. The audience needs to match what the receiving service expects.

There are two approaches to audience resolution in sidecar injection:

Destination-based audience: The sidecar uses the request's destination (extracted from the original destination address in the iptables redirect) to look up the expected audience for that endpoint. This requires a registry mapping endpoints to their expected JWT audiences. Maintainable for internal services you control; harder for external endpoints that don't publish their JWT audience.

URL-derived audience: Construct the audience from the destination's scheme, host, and port. Many services that accept JWTs let you configure the expected audience, so you can define the audience as the service's base URL (https://api.internal:8443) and configure the receiving service to validate against the same string. This is simpler to set up and requires no external registry.
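
The two approaches combine naturally: consult a registry first and fall back to the URL-derived value. The sketch below assumes a simple in-memory dictionary; AUDIENCE_REGISTRY, its single entry, and resolve_audience are illustrative names, not part of any real product.

```python
from urllib.parse import urlsplit

# Hypothetical registry mapping "host:port" to the audience that service expects.
AUDIENCE_REGISTRY = {
    "payments.internal:8443": "payments-api",
}


def resolve_audience(url: str, registry: dict[str, str] = AUDIENCE_REGISTRY) -> str:
    """Pick the JWT audience for an outbound request URL."""
    parts = urlsplit(url)
    # Destination-based: an explicit registry entry wins.
    if parts.netloc in registry:
        return registry[parts.netloc]
    # URL-derived fallback: scheme + host + port, with path and query dropped.
    return f"{parts.scheme}://{parts.netloc}"
```

With this fallback, a request to https://api.internal:8443/v1/orders gets the audience https://api.internal:8443, which is the string the receiving service would need to be configured to accept.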

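Within an Istio mesh this problem mostly does not arise: sidecars authenticate each other with mutual TLS using X.509 certificates that carry the workload's SPIFFE ID, so no JWT audience is involved. For cross-mesh or extra-cluster traffic, where the injected JWT is the credential, you need to handle audience resolution explicitly.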

Injecting Credentials for Non-HTTP Protocols

Envoy's HTTP filter chain only operates on HTTP traffic, which includes gRPC because gRPC runs over HTTP/2. For other protocols, such as database wire protocols and proprietary binary protocols, the layer-7 filter chain isn't available. Envoy does have network filters (operating at the TCP level) and can do TLS termination and origination for any TCP connection, but it can't inject application-layer credentials into non-HTTP protocols without protocol-specific filter support.

For database connections specifically, the options are:

  • Envoy's MySQL proxy filter and PostgreSQL proxy filter, which parse the MySQL/PostgreSQL wire protocols. These filters are aimed mainly at statistics and metadata, have limited functionality, and are not production-ready for credential injection use cases.
  • Protocol-level proxying: a smart proxy (like PgBouncer with IAM auth support, or a custom proxy) that understands the database protocol, accepts connections from the sidecar, and authenticates to the database using credentials fetched just-in-time from a credential service.
  • Application-level integration: the application code fetches credentials from the Workload API or a local credential service, rather than relying on the sidecar for injection. This is more intrusive but more reliable for non-HTTP protocols.

This is the honest tradeoff: Envoy sidecars work well for HTTP workload identity injection with zero application changes. For other protocols, you need either protocol-specific sidecar support or application-level integration. The sidecar pattern is not universal; it has a clear protocol boundary.

SPIFFE Workload API Integration

For an Envoy-based sidecar to inject SPIFFE-derived identity tokens, it needs access to the Workload API. In a SPIRE-equipped cluster, the Workload API socket is available at a well-known path mounted into every pod (via a hostPath volume or CSI driver). The identity injector service running alongside Envoy connects to that socket. X.509 SVIDs arrive over a stream and are rotated before they expire; JWT-SVIDs are requested per audience and need to be re-fetched before they expire.

The sequence for an outbound HTTP request with SPIFFE-based JWT injection:

  1. Application makes an HTTP request to the destination API.
  2. Outbound iptables redirect sends the TCP connection to Envoy port 15001.
  3. Envoy's ext_authz filter sends the request attributes (destination host, path, headers) to the identity injector.
  4. Identity injector looks up the current JWT-SVID for the target audience. If cached and not near expiry, returns it immediately. If expired or near expiry, fetches a fresh one from the Workload API.
  5. Identity injector returns an authorization decision to Envoy with Authorization: Bearer <jwt-svid> as an injected header.
  6. Envoy adds the header and forwards the request to the destination.
  7. Destination validates the JWT-SVID signature, checks expiry, checks audience, extracts the SPIFFE ID from the sub claim, and makes an authorization decision.
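
The claim checks in step 7 can be sketched as follows. This is a minimal Python sketch that assumes the signature has already been verified against the trust domain's JWT bundle by a JWT library; it covers only the expiry, audience, and subject checks. The function name validate_claims and the example trust domain are illustrative.

```python
import time
from typing import Optional


def validate_claims(claims: dict, expected_audience: str, trust_domain: str,
                    now: Optional[float] = None) -> str:
    """Check exp, aud, and sub of an already signature-verified JWT-SVID.

    Returns the caller's SPIFFE ID, or raises ValueError.
    """
    now = time.time() if now is None else now
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    aud = claims.get("aud", [])
    if isinstance(aud, str):  # aud may be a single string or a list
        aud = [aud]
    if expected_audience not in aud:
        raise ValueError("audience mismatch")
    sub = claims.get("sub", "")
    if not sub.startswith(f"spiffe://{trust_domain}/"):
        raise ValueError("subject is not a SPIFFE ID in the trusted domain")
    return sub
```

The returned SPIFFE ID is what the destination's authorization policy would then be evaluated against.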

Round-trip time for the ext_authz call to a local identity injector is typically under 1 ms, since the call stays on the loopback interface or a Unix domain socket and never crosses the network. When the JWT-SVID is already cached, the injector's work is a memory lookup. The latency impact is negligible for most workloads.
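
A per-audience cache with a refresh margin is what keeps the common case a memory lookup. The sketch below is one way to write it, under assumptions: TokenCache is a hypothetical class, fetch stands in for the Workload API call and returns a (token, expiry timestamp) pair, and the 30-second margin is an arbitrary example value. It is not thread-safe as written.

```python
import time
from typing import Callable


class TokenCache:
    """Cache JWT-SVIDs per audience and re-fetch them shortly before expiry."""

    def __init__(self, fetch: Callable[[str], tuple[str, float]],
                 refresh_margin: float = 30.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch            # returns (token, expiry as Unix timestamp)
        self._margin = refresh_margin  # seconds before expiry at which to refresh
        self._clock = clock
        self._entries: dict[str, tuple[str, float]] = {}

    def get(self, audience: str) -> str:
        entry = self._entries.get(audience)
        # Missing, expired, or within the refresh margin: fetch a fresh token.
        if entry is None or entry[1] - self._clock() <= self._margin:
            entry = self._fetch(audience)
            self._entries[audience] = entry
        return entry[0]
```

The margin trades a few extra fetches for the guarantee that a token handed to Envoy still has some lifetime left when it reaches the destination.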

What Changes at the Application Layer

With sidecar-based identity injection, the application sends HTTP requests with no authentication header (or with a placeholder that the sidecar replaces). The destination service receives requests with a valid authentication header that was added by the proxy layer.

This is transparent to the application code — it makes HTTP calls the same way it always did. No SDK changes, no credential loading code, no token renewal logic. The application doesn't know or need to know that its requests are being authenticated by the sidecar.

The tradeoff: the sidecar introduces a new failure mode. If the injector's connection to the Workload API fails, or if the ext_authz call times out, the request behavior depends on your failure_mode_allow setting. With failure_mode_allow: false (the default), requests fail closed: the sidecar rejects them rather than forwarding them unauthenticated. With failure_mode_allow: true, requests pass through without the authentication header, which may be appropriate for lower-security paths or development environments.
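
The two settings can be modeled in a few lines. This Python sketch only illustrates the behavior described above and is not Envoy code; outbound_headers is a hypothetical helper, and injected is None whenever the ext_authz call failed or timed out.

```python
from typing import Optional


def outbound_headers(headers: dict[str, str],
                     injected: Optional[dict[str, str]],
                     failure_mode_allow: bool) -> Optional[dict[str, str]]:
    """Return the headers to forward, or None if the request is rejected."""
    if injected is not None:
        # Normal case: the injector answered, so merge its headers in.
        return {**headers, **injected}
    if failure_mode_allow:
        # Fail open: forward the request without the authentication header.
        return dict(headers)
    # Fail closed: reject the request at the sidecar.
    return None
```

Failing open means the destination sees an unauthenticated request and has to reject it itself, so the destination's own validation remains the final line of defense either way.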

The sidecar pattern is the right approach for Kubernetes workloads where you want workload identity without touching application code. For environments where sidecar injection isn't feasible — serverless, non-Kubernetes, highly latency-sensitive workloads where the ext_authz hop matters — the SDK integration path is more appropriate. Both approaches achieve the same end state: workloads authenticating with ephemeral, attested identity rather than stored secrets.