90-day playbook

Static creds → workload identity.

If your credentials are static, rotating them faster doesn't close the audit finding — only workload identity does. Twelve weeks from "we have a vault" to "the workload proves who it is to the cloud, end-to-end". Closes the Vault Theatre anti-pattern.

Audience: Platform / SRE / security engineer leading credential hygiene across the estate
Pre-req: HashiCorp Vault or cloud-native secrets manager exists. Cloud SSO works for humans. Some services use static service-account credentials.
End state: Zero static long-lived credentials in production paths · workload identity (OIDC) for CI/CD, K8s, and serverless · remaining humans use SSO + PAM · auditable trail for every workload-to-cloud assertion.
Re-run diagnostic at week 13: DevSecOps Maturity diagnostic
Phase 1
Weeks 1–4

Inventory, classify, pick the wedge.

Phase 1 builds the credential map. Most teams discover they have 10× more static credentials than they thought, in places nobody's looked since the system was first set up.

Week 1

Inventory every credential path — CI · K8s · serverless · humans.

  • CI/CD: GitHub Actions secrets · GitLab CI variables · Jenkins credentials · CircleCI contexts
  • Production workloads: K8s service account tokens · Lambda env vars · Cloud Function env vars · VM instance profiles
  • Third-party SaaS: Datadog · PagerDuty · Slack apps · vendor integrations
  • Humans: IAM users · cloud-console logins · long-lived access tokens
Gate 1 · Credential register exists

Spreadsheet (or ITSM record). Every static credential listed: owner · purpose · age · last-rotated · privilege scope.
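
If the register lives in code rather than a spreadsheet, one row could look like the sketch below. The field names mirror the Gate 1 columns; the class name `CredentialRecord`, the helper `register_gaps`, and the sample entries are hypothetical and only illustrate the shape of the data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CredentialRecord:
    """One row of the credential register (fields mirror the Gate 1 columns)."""
    name: str
    owner: Optional[str]          # None means nobody has claimed it yet
    purpose: str
    created: date                 # used to compute age
    last_rotated: Optional[date]  # None means never rotated
    privilege_scope: str

    def age_days(self, today: date) -> int:
        return (today - self.created).days

    def days_since_rotation(self, today: date) -> int:
        # A credential that was never rotated counts from its creation date.
        return (today - (self.last_rotated or self.created)).days


def register_gaps(records: list[CredentialRecord]) -> list[str]:
    """Return names of records with no owner or no stated purpose."""
    return [r.name for r in records if not r.owner or not r.purpose.strip()]


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    today = date(2025, 1, 1)
    records = [
        CredentialRecord("ci-deploy-key", "platform-team", "prod deploys",
                         date(2022, 1, 1), None, "prod write"),
        CredentialRecord("legacy-metrics-key", None, "metrics export",
                         date(2021, 6, 1), date(2023, 6, 1), "monitoring read"),
    ]
    print("Unowned or unexplained:", register_gaps(records))
    for r in records:
        print(r.name, "days since rotation:", r.days_since_rotation(today))
```

Keeping "never rotated" explicit (as `None`) rather than defaulting it to the creation date in the data makes those credentials easy to single out later.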

Week 2

Classify by blast-radius. Tackle highest first.

  • Tier 0 — sovereign destroyer: AWS root account · GCP org-admin · Azure global-admin · prod DB superuser. Should be near-zero already; verify
  • Tier 1 — production write: deploy keys with prod write · prod K8s admin tokens · DB write creds
  • Tier 2 — production read or non-prod write: monitoring read · staging admin
  • Tier 3 — low-privilege: dev-environment creds
Gate 2 · Every Tier 0/1 credential identified, with owner

Plan exists for retiring each. No silent unknowns at Tier 0/1.
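
The tiering above can be expressed as a small rule set so that the register is classified consistently. This is a sketch under assumptions: the inputs (`environment`, `access`, `estate_wide_admin`) and the function names are hypothetical, and the rules are one reading of the tier definitions, not a definitive policy. In particular, it treats any dev-environment credential as Tier 3 and non-prod read access as Tier 3.

```python
from enum import IntEnum


class Tier(IntEnum):
    TIER_0 = 0  # root / org-admin / global-admin / prod DB superuser
    TIER_1 = 1  # production write
    TIER_2 = 2  # production read or non-prod write
    TIER_3 = 3  # low-privilege, e.g. dev-environment creds


def classify(environment: str, access: str, estate_wide_admin: bool = False) -> Tier:
    """Map a credential to a blast-radius tier (assumed rules, adjust to taste)."""
    if estate_wide_admin:
        return Tier.TIER_0
    if environment == "dev":
        return Tier.TIER_3
    if environment == "prod" and access in ("write", "admin"):
        return Tier.TIER_1
    if environment == "prod" or access in ("write", "admin"):
        return Tier.TIER_2
    return Tier.TIER_3


def gate2_blockers(credentials: list[dict]) -> list[str]:
    """Names of Tier 0/1 credentials that still have no owner."""
    blockers = []
    for cred in credentials:
        tier = classify(cred["environment"], cred["access"],
                        cred.get("estate_wide_admin", False))
        if tier <= Tier.TIER_1 and not cred.get("owner"):
            blockers.append(cred["name"])
    return blockers


if __name__ == "__main__":
    # Hypothetical sample entries.
    sample = [
        {"name": "prod-deploy-key", "environment": "prod", "access": "write", "owner": None},
        {"name": "staging-admin", "environment": "staging", "access": "admin", "owner": None},
    ]
    print(gate2_blockers(sample))
```

Gate 2 passes only when `gate2_blockers` returns an empty list for the full register.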

Week 3

Pick the wedge pipeline / workload — one that proves the pattern.

  • Criteria: high blast-radius (so the win is meaningful) · simple enough to migrate cleanly · owned by a team that can champion the pattern
  • Document the “before”: credential path · rotation cadence · last incident · audit findings
Gate 3 · Wedge identified + signed

Engineering + security agree this is the right first migration. Success criteria documented.

Week 4

Choose the workload-identity stack per platform.

  • GitHub Actions ↔ AWS: OIDC + AWS IAM role trust policy
  • GitHub Actions ↔ GCP: Workload Identity Federation
  • GitHub Actions ↔ Azure: Federated Credentials
  • K8s pods ↔ AWS: IRSA (IAM Roles for Service Accounts) · EKS Pod Identity
  • K8s pods ↔ GCP: Workload Identity
  • K8s pods ↔ Azure: Azure AD Workload Identity
  • Multi-cloud / on-prem: SPIFFE / SPIRE
Gate 4 · Stack chosen per platform with rationale

Decision documented; future migrations don't re-litigate.

Phase 2
Weeks 5–8

Migrate the wedge & document the pattern.

Phase 2 lands workload identity on the wedge pipeline end-to-end, then converts the experience into a repeatable paved-path template.

Week 5

Configure trust on the cloud side.

  • AWS: create IAM role with trust policy on GitHub OIDC provider · scoped to specific repo/branch (claim conditions)
  • GCP: create Workload Identity Pool + Provider · service account impersonation grant
  • Azure: create app registration · federated credential subject pinned to repo:branch
  • Test from a sandbox repo first. Verify token-claim conditions are tight enough (no broad repo:*)
Gate 5 · Trust policy live + audit-log verified

Test assertion from sandbox succeeds. Audit log shows the federated identity + claim conditions enforced.
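
For the AWS case, the trust policy pinned to a single repo and branch could be generated as in the sketch below. The account ID, organisation, repo, and branch are placeholders, and the function name `github_trust_policy` is hypothetical. It assumes the GitHub OIDC provider (`token.actions.githubusercontent.com`) is already registered in the account with `sts.amazonaws.com` as its audience.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_trust_policy(account_id: str, org: str, repo: str, branch: str) -> dict:
    """Build an IAM role trust policy that only one repo + branch can assume."""
    # Refuse wildcards so a broad subject such as repo:* cannot slip through.
    if any("*" in part for part in (org, repo, branch)):
        raise ValueError("wildcards are not allowed in org, repo, or branch")
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        f"{GITHUB_OIDC_HOST}:sub":
                            f"repo:{org}/{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    policy = github_trust_policy("123456789012", "example-org", "example-repo", "main")
    print(json.dumps(policy, indent=2))
```

Using `StringEquals` on the `sub` claim, rather than `StringLike` with a pattern, is what keeps the condition tight; loosen it only deliberately.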

Week 6

Migrate the wedge pipeline.

  • Replace static creds: remove AWS_ACCESS_KEY_ID / GCP_SA_KEY from secret store
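  • Configure OIDC step: actions/configure-aws-credentials@v4 with role-to-assume · aws-region, and no aws-access-key-id · the job also needs permissions: id-token: write so it can request the OIDC token
  • Run the pipeline. Pipeline succeeds → no long-lived credential was read from a secret store; the job exchanged a short-lived OIDC token for temporary role credentials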
  • Delete the old static creds. Not "rotate one last time" — delete
Gate 6 · Wedge pipeline running on workload identity

Test deploy completed. Original static creds deleted (verified absent in secret store).
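
A quick pre-merge check can confirm that the wedge workflow really has moved to OIDC. The sketch below is a plain-text scan, not a YAML parser, so treat it as a smoke test. The function name `workflow_findings` and the sample workflow (role ARN, region) are hypothetical.

```python
import re

# Inputs of actions/configure-aws-credentials that carry static keys.
STATIC_INPUTS = ("aws-access-key-id", "aws-secret-access-key")


def workflow_findings(workflow_text: str) -> list[str]:
    """Return reasons why a workflow does not look fully migrated to OIDC."""
    findings = []
    # The job must be allowed to request an OIDC token from GitHub.
    if not re.search(r"^\s*id-token:\s*write\s*$", workflow_text, re.MULTILINE):
        findings.append("missing 'id-token: write' permission")
    # The credentials step must name the role to assume.
    if not re.search(r"^\s*role-to-assume\s*:", workflow_text, re.MULTILINE):
        findings.append("no role-to-assume input found")
    # No static key inputs may remain (commented-out lines are ignored).
    for key in STATIC_INPUTS:
        if re.search(rf"^\s*{re.escape(key)}\s*:", workflow_text, re.MULTILINE):
            findings.append(f"static input still present: {key}")
    return findings


# Hypothetical migrated workflow, for illustration only.
MIGRATED = """\
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/example-deploy
          aws-region: ap-southeast-2
"""

if __name__ == "__main__":
    print(workflow_findings(MIGRATED))
```

This only checks the workflow file. Confirming that the old keys are gone from the secret store and deleted on the cloud side remains a separate manual or API-driven step.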

Week 7

Document the paved-path template.

  • Reusable workflow / shared action: standard OIDC-assume snippet with claim-condition examples
  • Terraform module: per-pipeline cloud-side trust setup, accepts repo + branch + role-scope as inputs
  • Developer-portal docs: "how to migrate your pipeline off static creds" with a 1-hour walkthrough
  • Anti-pattern callouts: what NOT to do (broad claim conditions · over-broad role policies · keeping a back-up static cred “just in case”)
Gate 7 · Paved-path template published + tested by 2nd team

A team that didn't do the wedge migrates a pipeline using only the docs. Time tracked; iterate on the docs.

Week 8

Migrate K8s workloads on the wedge cluster.

  • Enable IRSA / GCP Workload Identity / Azure WI on the cluster
  • Per service account: annotate to assume cloud role · remove static-credential mounts
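  • Verify pods get short-lived credentials from the platform's identity mechanism (IRSA: projected service-account token exchanged via STS · GCP Workload Identity: GKE metadata server · Azure WI: projected token exchanged for an Entra ID token) · audit log shows federated identity, not service-account key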
Gate 8 · Cluster on workload identity, zero pod-mounted static creds

Cluster audit clean. A test pod whose service account has no cloud-role binding is denied cloud access, as expected.
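
The per-service-account annotation step can be sketched as below. The annotation keys are the ones used by IRSA, GKE Workload Identity, and Azure AD Workload Identity respectively. The function names, the service-account name, and the role ARN are hypothetical, and the environment-variable check covers only the AWS static-key names as an example.

```python
import json

# Annotation key that binds a Kubernetes ServiceAccount to a cloud identity.
ANNOTATION_KEYS = {
    "aws": "eks.amazonaws.com/role-arn",          # value: IAM role ARN
    "gcp": "iam.gke.io/gcp-service-account",      # value: GCP service account email
    "azure": "azure.workload.identity/client-id", # value: app / managed identity client ID
}

# Env var names that indicate a static AWS key is still injected into a pod.
STATIC_ENV_NAMES = {"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"}


def service_account_manifest(name: str, namespace: str, cloud: str, identity: str) -> dict:
    """Build a ServiceAccount manifest annotated for workload identity."""
    if cloud not in ANNOTATION_KEYS:
        raise ValueError(f"unsupported cloud: {cloud}")
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {ANNOTATION_KEYS[cloud]: identity},
        },
    }


def static_env_vars(pod_spec: dict) -> list[str]:
    """List static-credential env var names found in any container of a pod spec."""
    found = []
    for container in pod_spec.get("containers", []):
        for env in container.get("env", []):
            if env.get("name") in STATIC_ENV_NAMES:
                found.append(env["name"])
    return found


if __name__ == "__main__":
    # Placeholder names and ARN, for illustration only.
    manifest = service_account_manifest(
        "example-app", "example-ns", "aws",
        "arn:aws:iam::123456789012:role/example-app-role",
    )
    print(json.dumps(manifest, indent=2))
```

Running `static_env_vars` over every pod spec in the cluster gives a first approximation of the "zero pod-mounted static creds" check in Gate 8; mounted secret volumes need a separate review.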

Phase 3
Weeks 9–12

Roll across, retire vault usage.

Phase 3 propagates the pattern across the estate and retires the static-credential paths systematically. The vault stays — for things that genuinely need a secret store — but its role shrinks.

Week 9

Roll the pattern to 5 more pipelines, 2 more clusters.

  • Onboarding workshop: 2-hour session for engineering leads · live-migrate one pipeline per attendee
  • Office hours: weekly drop-in for engineers needing migration help
  • Track migration count + time-to-onboard. Iterate on docs as patterns emerge
Gate 9 · 5 pipelines + 2 clusters migrated

Critical mass. Pattern is now “how we do it”, not “the new thing”.

Week 10

Retire static-creds for SaaS integrations where the vendor supports OIDC.

  • Datadog · PagerDuty · Snowflake · Databricks · etc.: many now support cloud-native OIDC or short-lived tokens
  • Audit each integration: vendor doc says what's possible · migrate where supported
  • Leftover static-credentials: moved to a dedicated “legacy” vault namespace with quarterly rotation cadence and named owner
Gate 10 · Every SaaS-vendor static cred classified as “migrated” or “legacy-pinned”

Legacy list shrinks each quarter. New integrations default to OIDC where available.

Week 11

Detection — flag any new static credential creation.

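  • Cloud-side: AWS Config rule or EventBridge rule on the CloudTrail CreateAccessKey event · Azure Policy · GCP Org Policy (e.g. the iam.disableServiceAccountKeyCreation constraint) to alert on, or block, access-key and service-account-key creation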
  • Source-side: PR linter that catches new AWS_ACCESS_KEY_ID patterns in code or YAML
  • Vault audit: log any new long-lived secret put. Owner must justify
Gate 11 · New-static-credential alarm live, ack'd weekly

Detection is the moat — without it the estate drifts back. Alarm reviewed by a named owner.
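
The source-side linter can start as a small script that scans only the lines a PR adds. The sketch below assumes a unified diff as input. The function name `scan_diff` and the pattern labels are hypothetical, and the `AKIA` prefix check covers long-lived AWS access key IDs only.

```python
import re

PATTERNS = {
    # Long-lived AWS access key IDs start with AKIA followed by 16 characters.
    "aws_access_key_id_literal": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Any new reference to the static-key variable names.
    "aws_static_key_variable": re.compile(r"\bAWS_(?:ACCESS_KEY_ID|SECRET_ACCESS_KEY)\b"),
}


def scan_diff(diff_text: str) -> list[tuple[int, str]]:
    """Return (diff line number, pattern label) for each hit on an added line."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the '+++ b/file' header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the example key ID used in AWS documentation.
    sample_diff = "\n".join([
        "+++ b/deploy.yml",
        "+  AWS_ACCESS_KEY_ID: AKIAIOSFODNN7EXAMPLE",
        "-  role-to-assume: arn:aws:iam::123456789012:role/example-deploy",
        "+  aws-region: ap-southeast-2",
    ])
    for number, label in scan_diff(sample_diff):
        print(f"diff line {number}: {label}")
```

Wire the script into CI so that a non-empty result fails the check, with an allow-list reviewed by the same named owner who acknowledges the alarm.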

Week 12

Document the “before / after” for the audit story.

  • Per regulator-relevant area (APRA CPS 234 · NIST 800-53 IA-5 · Essential Eight Maturity Level 2, E8 ML2): before-state credential count vs after-state · evidence of the rollout
  • Architecture diagram: updated to show federated-identity path · static-credential removal
  • External assurance / internal audit: review the work · sign off · file the evidence for next audit cycle
Gate 12 · Audit-ready evidence pack

If APRA / IRAP / SOC 2 auditor asks “show me your credential-management posture”, you produce a binder in <1 hour, not 1 week. Vault Theatre anti-pattern closed.

End of week 13.

Once workload identity is the default, the credential rotation question vanishes — there are no static credentials to rotate. The audit finding for APRA CPS 234 / NIST 800-53 / E8 ML2+ closes structurally, not procedurally.
