Lambda Identity Beyond Execution Roles

Raj Patel · Sep 26, 2024 · 11 min read

AWS Lambda execution roles are well understood: you attach an IAM role to a function, and the Lambda service assumes that role and exposes the resulting temporary STS credentials to the function as environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN). Code that uses the AWS SDK picks these up automatically via the credential provider chain. This part works well and is genuinely better than putting long-lived AWS access keys into the function's configuration yourself.

The gap is everything else. When your Lambda function calls your own REST API, connects to a non-RDS PostgreSQL instance, posts to a Slack webhook, or makes any other call outside the AWS SDK's credential chain, there's no equivalent automatic identity mechanism. These calls authenticate with whatever credentials you've loaded into environment variables or fetched from SSM Parameter Store, and those are static, long-lived, and shared across every invocation of that function.

What the Execution Role Actually Covers

A Lambda execution role provides temporary IAM credentials. The function can use these to call AWS services: read from S3, write to DynamoDB, publish to SNS, query RDS (if you're using RDS IAM authentication), invoke other Lambda functions, and so on. The SDK credential provider chain resolves the credentials from the execution environment automatically — you write boto3.client('s3') and the credentials are handled.
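
As a small illustration of what "handled" means here: inside the Lambda runtime the execution role's temporary credentials show up as the reserved environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN, and the SDK reads them on its own. The sketch below only reports which of them are present and never returns their values. The helper name is made up for this example.

```python
import os

# Reserved variables the Lambda runtime sets from the execution role.
CREDENTIAL_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def credential_vars_present(environ=None) -> dict:
    """Map each credential variable name to whether it is set (values are never returned)."""
    env = os.environ if environ is None else environ
    return {name: bool(env.get(name)) for name in CREDENTIAL_ENV_VARS}


def handler(event, context):
    # Safe to log or return: only booleans, no secret material.
    return credential_vars_present()
```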

The scope boundary: execution roles are an IAM concept, and IAM is an AWS concept. They don't extend to non-AWS systems. When your function calls an API endpoint that runs on your own infrastructure, that endpoint has no way to ask AWS "is this request from a legitimate Lambda invocation?" without additional integration work.

Concretely: if you have a Lambda that calls https://api.internal.example.com/data and that API needs to know whether the caller is authorized, it can't verify the IAM execution role identity of the Lambda. It can only validate whatever credential the Lambda sends in the request header — and that credential is typically a static API key stored in Lambda's environment variables.
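
The static-key pattern described above usually looks something like the following sketch. The variable name INTERNAL_API_KEY and the bearer-token header are assumptions for illustration; nothing here says how the API actually expects the key to be sent.

```python
import os
import urllib.request

API_URL = "https://api.internal.example.com/data"


def build_data_request(environ=None) -> urllib.request.Request:
    """Build the outbound request, authenticating with a static key from the environment."""
    env = os.environ if environ is None else environ
    # Hypothetical variable name. The value is identical for every invocation
    # and stays valid until someone rotates it.
    api_key = env["INTERNAL_API_KEY"]
    return urllib.request.Request(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Nothing in this request ties it to a particular function or invocation: any process holding the same key produces an identical header.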

Lambda's Identity Landscape

Lambda functions have several identity artifacts available at runtime. Most developers know about the execution role credentials. Fewer have worked through how those credentials can authenticate requests to function URLs, or how OIDC-style federation relates to Lambda.

OIDC federation tokens: the AssumeRoleWithWebIdentity pattern used for GitHub Actions and Kubernetes exchanges a signed OIDC token from a trusted issuer for temporary IAM role credentials. The documented Lambda runtime API does not hand the function a token of this kind, and resource-based policies only control who may invoke or manage the function. A function that wants to use this pattern therefore has to obtain its token from an issuer you operate or trust, and that issuer has to be registered as an IAM OIDC identity provider in the target account.

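Lambda Function URLs with IAM auth: If your API runs on Lambda and exposes a function URL with AuthType: AWS_IAM, the calling Lambda can sign requests with SigV4 using its execution role credentials. The Lambda service verifies the signature and checks that the caller is allowed lambda:InvokeFunctionUrl before the receiving function runs, and the caller's identity is passed to the receiver in the request context. This works within one account or across accounts, but only when the receiving side is a function URL (or another endpoint that validates SigV4, such as API Gateway with IAM authorization); it doesn't extend to APIs running on your own infrastructure.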
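
On the receiving side, a function behind an AWS_IAM function URL can read who called it from the event. The sketch below assumes the caller appears under requestContext.authorizer.iam.userArn, as in the function URL request payload format, and reuses the account ID and role name from this article's examples as a hypothetical allowed caller.

```python
import json

# Hypothetical allowed caller: the assumed-role ARN prefix of the calling
# function's execution role. The trailing slash prevents prefix collisions.
ALLOWED_CALLER_PREFIX = (
    "arn:aws:sts::111222333444:assumed-role/my-lambda-execution-role/"
)


def caller_arn(event: dict) -> str:
    """Return the IAM caller ARN from a function URL event, or '' if absent."""
    ctx = event.get("requestContext") or {}
    iam = (ctx.get("authorizer") or {}).get("iam") or {}
    return iam.get("userArn") or ""


def handler(event, context):
    arn = caller_arn(event)
    if not arn.startswith(ALLOWED_CALLER_PREFIX):
        return {"statusCode": 403, "body": json.dumps({"error": "caller not allowed"})}
    return {"statusCode": 200, "body": json.dumps({"caller": arn})}
```

IAM policy on lambda:InvokeFunctionUrl remains the primary control; a check like this is an extra application-level guard, not a replacement.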

Environment variables from SSM/Secrets Manager: The standard pattern for non-AWS credentials. Lambda pulls secrets at cold start (or runtime via SDK calls) from Parameter Store or Secrets Manager. These are static credentials — they're fetched fresh each cold start, but the underlying secret is static until it's manually rotated or automatic rotation fires.

Cross-Account Authentication

A common Lambda identity challenge: a function in account A needs to access resources in account B. The IAM-native way to handle this is cross-account role assumption:

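import boto3

# Must match the sts:ExternalId condition in the target role's trust policy.
EXTERNAL_ID = 'prod-data-access-2024'

def assume_cross_account_role(role_arn: str, session_name: str, external_id: str) -> dict:
    """Assume a role in another account and return its temporary credentials."""
    sts = boto3.client('sts')
    response = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        ExternalId=external_id,  # without it, a trust policy that requires one denies the call
        DurationSeconds=900  # 15 minutes, the minimum AssumeRole allows
    )
    return response['Credentials']

def handler(event, context):
    creds = assume_cross_account_role(
        role_arn='arn:aws:iam::999888777666:role/cross-account-data-reader',
        session_name=f"lambda-{context.aws_request_id}",
        external_id=EXTERNAL_ID
    )

    # Build a client that uses the account B credentials instead of the execution role.
    s3 = boto3.client(
        's3',
        aws_access_key_id=creds['AccessKeyId'],
        aws_secret_access_key=creds['SecretAccessKey'],
        aws_session_token=creds['SessionToken']
    )
    # ...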

Using context.aws_request_id as the session name gives you per-invocation attribution in account B's CloudTrail: the session name is part of the assumed-role ARN recorded with each event, so every cross-account access can be traced back to the specific Lambda invocation that made the call.
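
If you also want the function name in the session name, keep in mind that STS limits RoleSessionName to 64 characters drawn from letters, digits, underscore, and `+=,.@-`. A minimal sketch of a helper that respects those limits follows; the helper name and the choice to truncate the function name rather than the request ID are assumptions of this example.

```python
import re

_MAX_LEN = 64  # STS upper bound for RoleSessionName
# Anything outside the characters STS accepts in a session name.
_INVALID = re.compile(r"[^\w+=,.@-]", re.ASCII)


def session_name(function_name: str, request_id: str) -> str:
    """Build a RoleSessionName that keeps the request ID and as much of the function name as fits."""
    suffix = "-" + _INVALID.sub("-", request_id)
    room = max(0, _MAX_LEN - len(suffix))
    prefix = _INVALID.sub("-", function_name)[:room]
    # Final slice guards against an unexpectedly long request ID.
    return (prefix + suffix)[:_MAX_LEN]
```

In a handler this would be called as `session_name(context.function_name, context.aws_request_id)`.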

The trust policy in account B's role needs to allow account A's Lambda execution role to assume it:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::111222333444:role/my-lambda-execution-role"
    },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": {
        "sts:ExternalId": "prod-data-access-2024"
      }
    }
  }]
}

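The ExternalId condition is optional. Its purpose is to address the confused deputy problem: when the party assuming the role acts on behalf of several tenants, a tenant-specific ExternalId stops one tenant from steering that party into another tenant's role. It is not a secret, and it does not protect against a compromised workload in account A that can already use the execution role's credentials. If the trust policy includes the condition, the AssumeRole call must pass the same value as its ExternalId parameter, otherwise the call is denied.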

The Non-AWS Credential Problem in Detail

Consider a Lambda function in a data processing pipeline. It pulls events from SQS (execution role covers this), transforms them, and writes results to three places: a DynamoDB table (execution role), a customer's external webhook (API key in environment variable), and an internal PostgreSQL database (connection string in Secrets Manager).

Two of three credential types are well-handled by the execution role. The webhook API key and the database password are not. They're static secrets that live in Lambda's configuration and are shared across all invocations. The webhook API key in particular is often customer-specific — meaning the same Lambda is handling events for multiple customers and authenticating with different per-customer keys, all stored as environment variables.

Rotating these credentials requires: updating the secret in Secrets Manager or Parameter Store (with a custom rotation Lambda if you want Secrets Manager to do it on a schedule), updating the function configuration if the value is copied into environment variables, and making sure warm execution environments that cached the old value pick up the new one before the old one is revoked. This is automatable but requires building the automation; no AWS primitive handles it the way execution role credentials are rotated for you.
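
One way to soften the cutover is to let the function recover when a cached secret has been rotated underneath it: try the cached value, and on an authentication failure refetch once and retry. The sketch below is one possible shape for that, not an AWS feature. AuthRejected and the three callables are hypothetical names supplied by the caller.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class AuthRejected(Exception):
    """Raised by the caller-supplied operation when the remote side rejects the credential."""


def call_with_refresh(
    operation: Callable[[str], T],
    get_cached_secret: Callable[[], str],
    fetch_fresh_secret: Callable[[], str],
) -> T:
    """Run operation with the cached secret; on rejection, refetch once and retry."""
    try:
        return operation(get_cached_secret())
    except AuthRejected:
        # The secret may have been rotated while this environment was warm.
        # A second rejection propagates to the caller instead of looping.
        return operation(fetch_fresh_secret())
```

This only helps if the old credential stays valid until every warm environment has had a chance to fail over, or if a single failed attempt is acceptable for the downstream system.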

Using Lambda Extensions for Identity

Lambda Extensions run as a separate process alongside the function runtime. They're the right place to implement credential vending that's separate from the function code itself — the extension handles credential acquisition and rotation, the function code consumes credentials from a local endpoint.

The pattern for an identity extension:

-- Extension process --
1. On init: register with Lambda Extensions API, fetch initial credentials from external credential service using execution role identity as proof of function identity
2. Serve credentials on local HTTP port (e.g., http://localhost:2773/credentials)
3. Before each invocation: refresh credentials if TTL is below threshold
4. Function code reads credentials from local endpoint, not from env vars

-- Function code --
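import httpx

def get_api_key() -> str:
    # Ask the local extension for the current credential instead of reading env vars.
    response = httpx.get("http://localhost:2773/credentials/external-api", timeout=2.0)
    response.raise_for_status()  # fail loudly rather than parse an error body
    return response.json()["api_key"]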
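
The extension-side steps listed above can be sketched against the Lambda Extensions API, which an extension reaches at the host in AWS_LAMBDA_RUNTIME_API. This is a minimal outline under stated assumptions: the extension name is hypothetical, refresh_credentials is a caller-supplied function, and the local HTTP server that serves credentials to the function is left out.

```python
import json
import os
import urllib.request

# Hypothetical name; Lambda expects it to match the extension's file name.
EXTENSION_NAME = "identity-extension"


def _base_url() -> str:
    # AWS_LAMBDA_RUNTIME_API is set by Lambda inside the execution environment.
    return f"http://{os.environ['AWS_LAMBDA_RUNTIME_API']}/2020-01-01/extension"


def register_request() -> urllib.request.Request:
    """Request that registers the extension for INVOKE and SHUTDOWN events."""
    body = json.dumps({"events": ["INVOKE", "SHUTDOWN"]}).encode()
    return urllib.request.Request(
        f"{_base_url()}/register",
        data=body,
        headers={"Lambda-Extension-Name": EXTENSION_NAME,
                 "Content-Type": "application/json"},
        method="POST",
    )


def next_event_request(extension_id: str) -> urllib.request.Request:
    """Request that blocks until the next lifecycle event is available."""
    return urllib.request.Request(
        f"{_base_url()}/event/next",
        headers={"Lambda-Extension-Identifier": extension_id},
        method="GET",
    )


def run(refresh_credentials) -> None:
    """Register, fetch credentials once during init, then refresh on each event until shutdown."""
    with urllib.request.urlopen(register_request()) as resp:
        extension_id = resp.headers["Lambda-Extension-Identifier"]
    refresh_credentials()  # initial fetch during init
    while True:
        with urllib.request.urlopen(next_event_request(extension_id)) as resp:
            event = json.load(resp)
        if event.get("eventType") == "SHUTDOWN":
            break
        refresh_credentials()  # expected to be a no-op while the cached TTL is healthy
```

One caveat on step 3: the extension learns about an invocation at roughly the same time as the function does, so the refresh in this loop runs alongside the handler rather than strictly before it. Refreshing well ahead of expiry keeps that from mattering.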

This decouples the function code from credential management, lets the extension use the Lambda execution role identity to authenticate to the credential service, and gives you a place to implement short-lived credential issuance without changing the function code. The credential service receives proof of the Lambda's execution role identity (for example, a SigV4-signed STS GetCallerIdentity request that the service forwards to AWS to learn the caller's ARN) and can make authorization decisions based on which function is requesting credentials.

This is also where Aembit's Lambda integration operates. The Lambda function authenticates to Aembit using its execution role identity (which Aembit can verify against AWS's STS GetCallerIdentity API). Aembit checks whether that function identity is authorized to access the requested resource, and if so, issues a short-lived credential — an API token, database password, or other credential type — scoped to that invocation's needs. The function code uses the credential and discards it. No static secrets in the function environment.

Cold Start Implications

One practical concern with just-in-time credential fetching in Lambda is cold start latency. If your function needs to make an HTTP call to fetch credentials before it can do any work, that adds to cold start time, which latency-critical applications are already sensitive to.

Mitigations:

Use Provisioned Concurrency for latency-sensitive functions, pre-warming them and amortizing the credential fetch across many invocations.
Cache credentials in the extension between invocations (within the execution environment lifetime), only refetching when the TTL approaches expiry.
For execution environments that are reused frequently (high-traffic functions), the credential fetch happens once per environment lifetime, not per invocation.
Parallelize credential fetching with other init-time work where possible.
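
The caching mitigation can be sketched as a small holder that refetches only when the credential is within a margin of its expiry. The class name, the fetch callable that returns a (value, ttl_seconds) pair, and the 60-second margin are all assumptions of this example.

```python
import time
from typing import Callable, Optional, Tuple


class CachedCredential:
    """Holds one credential and refetches it only when it is close to expiry."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        refresh_margin: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch            # returns (value, ttl_seconds)
        self._margin = refresh_margin  # refetch this many seconds before expiry
        # Wall-clock time by default, on the assumption that an environment
        # frozen between invocations should still see real elapsed time.
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at - self._margin:
            value, ttl = self._fetch()
            self._value, self._expires_at = value, now + ttl
        return self._value
```

Kept at module scope in the function, or inside the extension process, an instance like this survives for the lifetime of the execution environment, so the remote fetch is paid once per environment and then once per TTL.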

For most functions, one HTTPS call from the extension to a same-region credential service at cold start is acceptable, and later reads from the local endpoint cost far less. For functions where even that matters, Provisioned Concurrency moves initialization, including the credential fetch, ahead of the first request, so invocations served by provisioned environments don't pay the cold start.

The Right Security Model for Serverless Identity

The execution role handles Lambda's identity within the AWS ecosystem well. The pattern for extending it to non-AWS resources: treat the execution role identity as the root of trust, and exchange it for short-lived credentials to specific external resources via a credential service that understands which Lambda functions are authorized to access which resources.

The function never holds a static secret for any external resource. It holds an execution role identity that AWS manages and rotates automatically, and it uses that identity to fetch ephemeral credentials for everything else at the moment it needs them. The credential service is the authorization point — it's where the policy lives for which function can reach which external system.
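
To make "the credential service is the authorization point" concrete, here is a minimal sketch of the decision such a service might make once it has already verified the caller's ARN (for example via STS GetCallerIdentity). The policy table, resource names, and TTL are hypothetical, and actual credential issuance is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical policy: (account ID, execution role name) -> resources it may request.
POLICY = {
    ("111222333444", "my-lambda-execution-role"): {"external-api", "reporting-db"},
}

CREDENTIAL_TTL_SECONDS = 300  # illustrative lifetime for issued credentials


@dataclass(frozen=True)
class Grant:
    resource: str
    role: str
    ttl_seconds: int


def parse_assumed_role(arn: str) -> Optional[Tuple[str, str]]:
    """Return (account_id, role_name) for an STS assumed-role ARN, else None."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[:3] != ["arn", "aws", "sts"]:
        return None
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "assumed-role":
        return None
    return parts[4], resource[1]


def authorize(caller_arn: str, resource: str) -> Optional[Grant]:
    """Decide whether the verified caller may receive a credential for resource."""
    identity = parse_assumed_role(caller_arn)
    if identity is None or resource not in POLICY.get(identity, set()):
        return None
    return Grant(resource=resource, role=identity[1], ttl_seconds=CREDENTIAL_TTL_SECONDS)
```

Keying the policy on the execution role means each function needs its own role for the decision to be per-function; functions that share a role are indistinguishable to a service like this.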

This is a meaningfully better security posture than environment variables full of API keys, and it's achievable without changing the function's business logic. The credential fetching can be encapsulated in a Lambda Extension or an SDK wrapper, keeping the function code clean and the security properties consistent across all functions in the environment.