Shift-left security: why runtime approval beats pre-flight checks
The DevSecOps movement had a simple, correct insight: find security problems earlier. Move the security check closer to the developer, closer to the code, closer to the commit. Don't wait for production — catch it in CI. Don't wait for CI — catch it in the IDE.
So we built OPA policies. We adopted Kyverno for Kubernetes admission control. We ran Checkov and tfsec in our pipelines. We linted Dockerfiles. We scanned container images before push. The shift-left philosophy worked beautifully — for code.
Then AI agents arrived, and they broke every assumption shift-left security makes.
What shift-left actually assumes
Policy-as-code tools work because of a fundamental property of the things they check: the behavior is known before execution.
A Terraform plan is a static artifact. A Kubernetes manifest is a static artifact. A Dockerfile, a Helm chart, a GitHub Actions workflow — all static. You can inspect them, reason about them, run policy checks against them, and catch dangerous configurations before anything runs.
The entire shift-left model depends on this. You can check a terraform plan output against an OPA policy because the plan exists as a document. You can admit or reject a Kubernetes deployment because the manifest exists before the pods start.
Static analysis works on static things. The shift-left stack was built for a world where humans write the things that run.
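This is easy to see concretely. The sketch below is not a real OPA/Rego policy, just an illustrative Python check over a parsed Terraform plan (the dict structure mirrors `terraform show -json` output, simplified): because the plan exists as a document before apply, a pre-flight check can walk it and flag a dangerous configuration with certainty.

```python
# Illustrative pre-flight check over a static artifact (not OPA, just the idea).
# A Terraform plan is a document: we can inspect it before anything runs.

def violations(plan: dict) -> list[str]:
    """Flag planned resources that open SSH (port 22) to the whole internet."""
    found = []
    for change in plan.get("resource_changes", []):
        after = change.get("change", {}).get("after") or {}
        for rule in after.get("ingress", []):
            if rule.get("from_port") == 22 and "0.0.0.0/0" in rule.get("cidr_blocks", []):
                found.append(change["address"])
    return found

# Simplified stand-in for `terraform show -json plan.out`:
plan = {
    "resource_changes": [
        {
            "address": "aws_security_group.web",
            "change": {"after": {"ingress": [
                {"from_port": 22, "cidr_blocks": ["0.0.0.0/0"]},
            ]}},
        },
    ],
}
print(violations(plan))  # ['aws_security_group.web']
```

The check is reliable precisely because its input is frozen: nothing about the plan changes between the check and the apply.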
AI agents don't work that way
An AI coding agent doesn't produce a plan document you can inspect. It produces shell commands in real time, in response to live context — the current state of the codebase, the contents of a file it just read, the output of a command it just ran, the instruction it received three steps ago.
The command that an agent will run in 10 seconds doesn't exist yet. It's being generated right now, based on information that didn't exist when your policy was written.
Consider a real example:
# The agent is fixing a bug in a user migration script.
# It's run these commands without incident:
git diff HEAD~1 migrations/
cat migrations/0042_add_user_roles.sql
psql $DATABASE_URL -c "SELECT count(*) FROM users WHERE role IS NULL"
# Then the agent decides the fix is to backfill the column.
# It generates this command next:
psql $DATABASE_URL -c "UPDATE users SET role='viewer' WHERE role IS NULL"
That UPDATE touches every row where role IS NULL.
In staging, that's 3 rows. In production, it might be 2 million.
No static analysis tool could have caught this. The command was generated dynamically. Its danger is a function of runtime state — the size of the users table — which no pre-flight check can know.
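To make "danger is a function of runtime state" concrete, here is a small Python sketch using in-memory SQLite databases as stand-ins for staging and production (scaled down from millions of rows to keep the demo fast). The command string is byte-for-byte identical in both environments; only the data differs.

```python
# Sketch: the same UPDATE is trivial or catastrophic depending on runtime state.
# Two in-memory databases stand in for staging and production.
import sqlite3

def affected_rows(n_null_roles: int) -> int:
    """Run the identical backfill UPDATE against a database with the given state."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, role TEXT)")
    db.executemany("INSERT INTO users (role) VALUES (?)",
                   [(None,)] * n_null_roles + [("admin",)] * 5)
    cur = db.execute("UPDATE users SET role='viewer' WHERE role IS NULL")
    return cur.rowcount  # rows actually modified

print(affected_rows(3))       # staging-like state: 3 rows touched
print(affected_rows(50_000))  # production-like state: 50,000 rows touched
```

Any check that only sees the command string sees the same string in both cases. The blast radius lives in the data, and the data only exists at runtime.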
The three blind spots of shift-left for AI agents
1. Context-dependent risk
The same command can be safe or catastrophic depending on runtime context.
rm -rf /tmp/build-* is harmless cleanup in a CI container.
The same command is a disaster if /tmp is symlinked to production storage.
OPA can check whether rm -rf appears in a Dockerfile or a workflow YAML.
It cannot check whether the path argument resolves safely in the environment where it runs.
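A runtime check, by contrast, can resolve the path in the environment where the command is about to run. This Python sketch (illustrative, not expacti's implementation) shows the difference: the command string is constant, but one of the two targets is a symlink that escapes the sandbox, and only symlink resolution at runtime can tell them apart.

```python
# Sketch: same path argument, different runtime resolution.
# One build dir is real; the other is a symlink pointing outside the sandbox.
import os
import tempfile

def resolves_under(path: str, root: str) -> bool:
    """True if path, after resolving symlinks, stays inside root."""
    real_root = os.path.realpath(root)
    real_path = os.path.realpath(path)
    return os.path.commonpath([real_path, real_root]) == real_root

with tempfile.TemporaryDirectory() as tmp:
    safe = os.path.join(tmp, "build-1")
    os.mkdir(safe)
    trap = os.path.join(tmp, "build-2")
    os.symlink("/", trap)  # the "symlinked to production storage" scenario

    safe_ok = resolves_under(safe, tmp)  # genuinely under tmp
    trap_ok = resolves_under(trap, tmp)  # resolves to /

print(safe_ok, trap_ok)  # True False
```

A static scan of the command string `rm -rf /tmp/build-*` cannot perform this resolution, because the filesystem it would need to consult does not exist at policy-authoring time.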
2. Prompt injection changes agent behavior at runtime
An agent reading a file, processing an API response, or summarizing database content may encounter adversarial input designed to change its behavior. A malicious README, a crafted commit message, a poisoned environment variable — any of these can redirect a capable agent toward actions the developer never intended.
Static policy tools check the agent's initial configuration. They don't check what the agent has read in the last five minutes.
3. Emergent behavior from multi-step reasoning
Agents reason across multiple steps, and safe individual steps can compose into dangerous aggregate behavior. An agent might:
- Read a config file — innocuous
- Identify an environment variable with a path — innocuous
- Run ls on that path — innocuous
- Decide to clean up stale files at that path — dangerous if the path is wrong
Each step passes any reasonable policy check. The combination doesn't.
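The steps above can be sketched as a trace against a per-step policy. The policy below is hypothetical and deliberately simple, but the failure mode is general: every individual command is approved, and the check never sees what the path variable resolves to at runtime.

```python
# Sketch: each step passes a per-command check; the composition is the hazard.
# Hypothetical per-step policy -- reads are fine, "scoped" cleanup looks fine too.

steps = [
    "cat deploy.conf",         # read a config file
    "echo $DATA_DIR",          # identify an environment variable with a path
    "ls $DATA_DIR",            # list that path
    "rm -rf $DATA_DIR/stale",  # clean up stale files -- dangerous if $DATA_DIR is wrong
]

def step_allowed(cmd: str) -> bool:
    """Per-step check over the command string alone."""
    if cmd.startswith(("cat ", "echo ", "ls ")):
        return True  # read-only
    if cmd.startswith("rm -rf ") and cmd.rstrip().endswith("/stale"):
        return True  # looks like scoped cleanup; the check can't see $DATA_DIR's value
    return False

print(all(step_allowed(s) for s in steps))  # True -- every step passes in isolation
```

If `$DATA_DIR` was mis-set three steps earlier, the final command is a disaster, yet no single-step check in this style flags it. The risk is in the sequence, not in any one string.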
Where shift-left is still valuable
None of this means you should throw away OPA or Kyverno. Shift-left tools are still extremely valuable for the things they were designed to check:
| Use case | Best tool | Why |
|---|---|---|
| Kubernetes admission control | OPA / Kyverno | Manifests are static artifacts |
| Terraform plan review | OPA / Checkov | Plan output is static, inspectable |
| Container image scanning | Trivy / Grype | Image contents are known before pull |
| CI/CD workflow gates | OPA / Conftest | Workflow YAML is static |
| Agent-generated shell commands | Runtime approval | Commands don't exist until runtime |
The pattern is clear: shift-left works on artifacts that exist before execution. For AI agent commands — which are generated dynamically at runtime — you need something different.
What runtime approval actually provides
Runtime approval isn't a replacement for shift-left. It's a different layer, addressing a different threat surface. The core properties:
A human sees what's about to happen
When an agent generates UPDATE users SET role='viewer' WHERE role IS NULL, a runtime approval gate intercepts it before execution and surfaces it to a reviewer. The reviewer sees the actual command, in context, at the moment it would run.
Not a policy check against a YAML rule. Not a scan of the agent's configuration. The actual command, with the actual runtime state visible, reviewed by a human.
Whitelist accumulates from real usage
Every approved command contributes to a per-org whitelist. Commands the agent has run safely many times before — git status, cargo test -p expacti-backend, docker ps --format '{{.Names}}' — pass through instantly without interrupting flow.
The approval overhead concentrates exactly where it should: on novel, potentially dangerous commands. Routine operations stay frictionless.
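The mechanics can be sketched in a few lines of Python. This is a hypothetical gate, not expacti's actual API: commands are normalized (so trivial whitespace differences don't create "new" commands), approved ones accumulate in a set, and anything novel is held for review.

```python
# Sketch: an accumulating per-org whitelist (hypothetical API, not expacti's).
import shlex

def normalize(cmd: str) -> str:
    """Shell-aware tokenization collapses insignificant whitespace."""
    return " ".join(shlex.split(cmd))

class ApprovalGate:
    def __init__(self) -> None:
        self.whitelist: set[str] = set()

    def check(self, cmd: str) -> str:
        """'pass' for previously approved commands, 'pending' for novel ones."""
        return "pass" if normalize(cmd) in self.whitelist else "pending"

    def approve(self, cmd: str) -> None:
        """A reviewer approval adds the normalized command to the whitelist."""
        self.whitelist.add(normalize(cmd))

gate = ApprovalGate()
gate.approve("git  status")              # approved once (note the extra space)
print(gate.check("git status"))          # pass    -- same command after normalization
print(gate.check("rm -rf /var/log"))     # pending -- novel, waits for a reviewer
```

The effect is that friction decays over time: the longer the gate runs, the more of the agent's routine vocabulary flows through untouched.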
Risk scoring surfaces the dangerous ones
Not every novel command deserves the same attention. A command the agent has never run before that touches /etc/passwd is different from a new grep invocation. Risk scoring — based on command category, destructiveness indicators, privilege escalation markers — helps reviewers prioritize.
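As a rough illustration, a scorer over those three signal families might look like this. The indicators and weights below are invented for the example, not a real scoring model:

```python
# Sketch: heuristic risk scoring (indicators and weights are illustrative).
def risk_score(cmd: str) -> int:
    score = 0
    if any(tok in cmd for tok in ("rm -rf", "DROP TABLE", "dd if=", "mkfs")):
        score += 5  # destructiveness indicators
    if any(tok in cmd for tok in ("sudo ", "chmod 777", "/etc/passwd")):
        score += 3  # privilege escalation / sensitive-path markers
    if cmd.split()[0] in ("grep", "ls", "cat", "git"):
        score -= 2  # read-only command category
    return max(score, 0)

print(risk_score("grep -r TODO src/"))     # 0 -- routine, read-only
print(risk_score("cat /etc/passwd"))       # 1 -- read-only but sensitive path
print(risk_score("sudo rm -rf /var/log"))  # 8 -- destructive and privileged
```

A reviewer queue sorted by a score like this puts the `sudo rm -rf` at the top and lets the `grep` wait, which is exactly where human attention should concentrate.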
Shift-left catches known-bad configurations. Runtime approval catches unknown-dangerous commands. You need both.
The "policy-as-code is enough" objection
The most common pushback from DevSecOps teams is: "we can write OPA policies that cover dangerous commands — just deny when the command matches certain patterns."
This sounds reasonable. It fails in practice for two reasons.
First, pattern-matching on shell commands is fundamentally brittle. Commands are strings. rm -rf /var/log and rm  -rf  /var/log (with extra spaces) are the same operation but different strings. Bash variable expansion, subshells, and aliases make the problem worse. Agents that know your patterns can generate functionally equivalent commands that bypass them.
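The brittleness is easy to demonstrate. Here is a literal deny pattern against four command strings that all perform the same deletion; the candidate commands are invented for the example:

```python
# Sketch: a regex deny-list against functionally equivalent shell commands.
import re

DENY = re.compile(r"rm -rf /var/log")

candidates = [
    "rm -rf /var/log",                     # caught: literal match
    "rm  -rf  /var/log",                   # missed: extra whitespace
    "d=/var/log; rm -rf $d",               # missed: variable expansion
    'bash -c "rm -rf /var/$(echo log)"',   # missed: command substitution
]

results = [bool(DENY.search(cmd)) for cmd in candidates]
print(results)  # [True, False, False, False]
```

One out of four. You can keep adding patterns, but the space of equivalent shell strings is effectively unbounded, and a capable agent generates commands from that whole space.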
Second, the threat model for AI agents is different. You're not trying to prevent a known class of attacks by a known class of adversaries. You're trying to maintain human understanding and oversight of what a capable, well-intentioned but fallible system is doing on your behalf. Policy-as-code was designed for the former. Runtime approval is designed for the latter.
The real question isn't "is this command in my deny list?" It's "does a human who understands the current context agree this is the right command right now?" Static policies can't answer that.
Practical integration: both layers together
The right answer isn't to choose. Layer them:
- Shift-left for infrastructure: OPA policies gate Kubernetes manifests, Terraform plans, and container images before they reach your environment. Kyverno enforces resource limits, network policies, and image registry restrictions. Checkov catches misconfigurations in IaC before apply.
- Runtime approval for agent actions: Every command an AI agent generates goes through a human approval gate before execution. Known-safe commands (whitelist) pass through automatically. Novel or high-risk commands require explicit reviewer sign-off.
These layers address different threat surfaces and don't overlap in a meaningful way. Adding runtime approval doesn't invalidate your OPA investment. It covers the gap that OPA explicitly cannot cover.
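Put together, the runtime layer reduces to a single decision per command, taken after all the shift-left checks have already gated the static artifacts. This is a hypothetical routing function, not expacti's implementation, with a deliberately crude risk heuristic:

```python
# Sketch: the runtime layer as one decision function (hypothetical flow).
# Static artifacts were already gated by OPA/Kyverno/Checkov before this point.
def route(cmd: str, whitelist: set[str], threshold: int = 3) -> str:
    if cmd in whitelist:
        return "auto-approve"  # known-safe from prior reviewer approvals
    # Crude stand-in for real risk scoring:
    score = 5 if any(t in cmd for t in ("rm -rf", "UPDATE ", "DROP ")) else 1
    return "pending:high" if score >= threshold else "pending:low"

wl = {"git status", "cargo test"}
print(route("git status", wl))                            # auto-approve
print(route("ls -la", wl))                                # pending:low
print(route('psql $DB -c "UPDATE users SET ..."', wl))    # pending:high
```

Everything that reaches "pending" waits for a human; everything else was either pre-approved by usage history or pre-vetted as a static artifact upstream.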
Example: agent running a database migration
# Your CI already has OPA policies that check:
# - Terraform plans for dangerous IAM changes
# - K8s manifests for privileged container requests
# - Dockerfile for root user usage
# Your agent is executing the migration:
$ expacti run -- psql $DATABASE_URL -f migrations/0043_add_indexes.sql
# → Approved (whitelisted pattern, run 12 times before)
$ expacti run -- psql $DATABASE_URL -c "UPDATE users SET ..."
# → PENDING APPROVAL
# Reviewer sees: UPDATE affecting users table, no WHERE clause on primary key
# Reviewer: DENY — missing WHERE filter, could touch all rows
# Agent: command blocked, reviewer note surfaced back to agent
$ expacti run -- psql $DATABASE_URL -c "UPDATE users SET role='viewer' WHERE role IS NULL AND created_at > '2026-01-01'"
# → PENDING APPROVAL
# Reviewer sees: narrower UPDATE, date filter, estimated 1,247 rows
# Reviewer: APPROVE
OPA wouldn't have caught the missing WHERE filter — the command is syntactically valid and doesn't match any malicious pattern. The runtime reviewer caught it immediately.
The shift-left metaphor is right, but incomplete
"Shift left" means moving security earlier in the development lifecycle. That's still exactly right. But the development lifecycle now includes AI agents executing commands at runtime — and there's no further left to shift a command that didn't exist until the agent generated it.
Runtime approval is what happens when you've shifted everything as far left as it can go, and you still need a human in the loop for the last mile.
The goal is the same: a human who understands the intent and the risk signs off before anything irreversible happens. Shift-left achieves that for static artifacts. Runtime approval achieves it for dynamic agent actions. Together, they close the loop.
Runtime approval for your AI agents
See how expacti fits into your existing DevSecOps stack — alongside OPA, Kyverno, and your CI pipeline.
Try the interactive demo →