Why every AI agent command needs explicit human approval
AI coding agents are executing shell commands at machine speed. Here's why "just review the logs afterward" is the wrong mental model — and what to do instead.
The new reality: agents that act
A year ago, AI assistants mostly wrote text. Today they run code, modify files, restart services, and interact with databases — all autonomously, in your production environment.
This is genuinely useful. An agent that can scaffold a project, run migrations, and deploy a fix is orders of magnitude more productive than one that just suggests code. But that capability cuts both ways.
"The question isn't whether AI agents should be able to run commands. It's whether they should run them without you seeing it happen."
Most teams handle this by reviewing logs afterward. That feels reasonable until something goes wrong — and by then, the damage is done.
The "review logs after" fallacy
Post-hoc review has a fundamental problem: it's not actually a control. It's forensics. You're not preventing bad things from happening; you're finding out about them after the fact.
Consider the blast radius of a few common agent mistakes:
- `rm -rf /old-data` — wrong directory, data gone
- `DROP TABLE users;` — confused environment, prod instead of staging
- `git push --force origin main` — overwrites a week of work
- `curl ... | bash` — arbitrary remote code execution
These aren't hypothetical. They're the exact category of commands that AI coding agents propose and execute regularly. The agent isn't malicious — it's just operating with incomplete context, or the user approved something without fully reading it.
What "human-in-the-loop" actually means
The phrase gets used loosely. People call a system "human-in-the-loop" if a human initiated the agent run, or if they can review what happened. That's not enough.
Real human-in-the-loop means the human is a gate, not a spectator. Specifically:
- The agent proposes an action
- Execution is suspended until a human explicitly approves
- The human can deny, and the agent must handle that gracefully
- Everything is recorded in an immutable audit trail
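These four requirements can be sketched as a tiny state machine. The sketch below is illustrative and framework-agnostic; the names (`ProposedAction`, `review`, `execute`) are invented for this example and are not part of any SDK:

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class ProposedAction:
    command: str
    decision: Decision = Decision.PENDING
    audit: list = field(default_factory=list)  # append-only record

    def record(self, event: str) -> None:
        self.audit.append((time.time(), event))


def review(action: ProposedAction, approve: bool, reviewer: str) -> None:
    # The human is the gate: nothing executes until this runs.
    action.decision = Decision.APPROVED if approve else Decision.DENIED
    action.record(f"{reviewer}: {action.decision.value}")


def execute(action: ProposedAction) -> str:
    # Execution stays suspended (and is audited) unless explicitly approved.
    if action.decision is not Decision.APPROVED:
        action.record("blocked: not approved")
        return "blocked"
    action.record(f"executed: {action.command}")
    return "executed"
```

Note that a denial is a first-class outcome here, not an exception: the agent gets a clear signal it can handle, and the audit list captures every transition.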
This isn't novel — it's how nuclear launch procedures, surgical checklists, and financial compliance work. High-stakes systems require explicit human authorization at the point of action, not before or after.
The whitelist problem
One common objection: "We'd whitelist safe commands and only require approval for dangerous ones. Shouldn't `ls` be allowed automatically?"
Yes — and this is exactly the right instinct. A good human-in-the-loop system has a whitelist engine. But building that whitelist correctly is harder than it looks:
The whitelist challenge
- `ls /tmp` — safe. `ls /root/.ssh` — maybe not.
- `cat app.log` — fine. `cat /etc/shadow` — very much not.
- `docker ps` — harmless. `docker exec -it prod bash` — opens a shell in production.
Context matters. A flat whitelist of "safe commands" misses this entirely.
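To see the failure concretely, here is a deliberately naive flat whitelist and a minimally context-aware variant. The binary list and sensitive-path patterns are illustrative, not a recommended policy:

```python
FLAT_WHITELIST = {"ls", "cat", "docker"}  # "safe" binaries, naively


def flat_allows(command: str) -> bool:
    # Checks only the binary name; every argument is ignored.
    return command.split()[0] in FLAT_WHITELIST


# Illustrative deny patterns; a real policy would be far richer.
SENSITIVE_PATHS = ("/etc/shadow", "/root/.ssh")


def context_aware_allows(command: str) -> bool:
    # Same binary check, but the arguments can veto the decision.
    return flat_allows(command) and not any(
        p in command for p in SENSITIVE_PATHS
    )
```

A flat whitelist happily passes both `cat app.log` and `cat /etc/shadow`; only the context-aware check distinguishes them.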
The right approach is a whitelist that evolves with use: start strict, auto-approve commands that have been reviewed multiple times without issue, use risk scoring to triage, and surface AI-suggested pattern rules for the reviewer to approve in bulk.
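One way to sketch that evolution is to promote a command to auto-approval only after repeated manual approvals, and never above a risk threshold. The thresholds and risk patterns below are invented for illustration:

```python
from collections import defaultdict

# Illustrative thresholds; real systems would tune these per team.
PROMOTE_AFTER = 3   # manual approvals required before auto-approval
MAX_AUTO_RISK = 2   # never auto-approve anything scoring above this

# Toy risk patterns with weights; a real scorer would be richer.
RISKY_PATTERNS = {"rm -rf": 5, "--force": 4, "/etc/": 3, "exec": 3}


def risk_score(command: str) -> int:
    return sum(w for p, w in RISKY_PATTERNS.items() if p in command)


class EvolvingWhitelist:
    def __init__(self):
        self.approvals = defaultdict(int)
        self.auto = set()

    def needs_review(self, command: str) -> bool:
        # Start strict: everything needs review until promoted.
        return command not in self.auto

    def record_approval(self, command: str) -> None:
        # Promote only low-risk commands that reviewers have
        # repeatedly approved without issue.
        self.approvals[command] += 1
        if (self.approvals[command] >= PROMOTE_AFTER
                and risk_score(command) <= MAX_AUTO_RISK):
            self.auto.add(command)
```

The key design choice is that risk scoring caps promotion: `ls /tmp` can earn its way onto the whitelist, but `rm -rf /old-data` never will, no matter how many times it is approved.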
What the reviewer UX needs to be
If approval is a gate, the gate can't be annoying. Every second of latency is a second the agent is blocked. Reviewer UX matters enormously:
- The reviewer sees the exact command, with context (working directory, session, risk score)
- Approval is one keystroke (`a` = approve, `d` = deny)
- Mobile-friendly, so reviewers aren't chained to their desk
- Notifications via push, Slack, Teams — wherever reviewers actually are
- Auto-deny on timeout, with configurable policy (don't silently approve when no one is watching)
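The timeout policy deserves emphasis: absence of a reviewer must never mean "yes." A minimal fail-closed wait, sketched with the standard library (the function and its signature are invented for this example):

```python
import threading


def wait_for_decision(decided: threading.Event,
                      decision: dict,
                      timeout_s: float) -> str:
    """Block until a reviewer decides, or auto-deny on timeout."""
    if decided.wait(timeout=timeout_s):
        # A reviewer acted in time; honor whatever they chose.
        return decision.get("verdict", "denied")
    # No one was watching: fail closed, never open.
    return "denied"
```

In production the event would be a queue message or webhook rather than an in-process `threading.Event`, but the fail-closed shape is the same.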
When the approval UX is this lightweight, the median review time drops to 5–15 seconds. That's acceptable for most agent workflows.
The audit trail as compliance artifact
There's a second-order benefit that matters for engineering organizations: an immutable audit trail of every command, who approved it, when, and whether it ran.
This isn't just forensics. It's a compliance artifact that maps directly to SOC 2 CC6.x controls (access management, change management), ISO 27001 Annex A.9 and A.12.4, and emerging AI governance requirements.
When an auditor asks "who authorized this database migration," the answer shouldn't be "we think the Terraform plan was approved in a Slack thread." It should be a signed, timestamped record with reviewer identity and the exact command.
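A signed, timestamped record can be as simple as an HMAC over the canonical record body. This is a sketch of the idea using only the standard library; the field names and key handling are assumptions, and real key management is out of scope:

```python
import hashlib
import hmac
import json
import time

# Assumption: in practice this key lives in a secrets manager.
SIGNING_KEY = b"replace-with-a-real-secret"


def audit_record(command: str, reviewer: str, verdict: str) -> dict:
    record = {
        "command": command,
        "reviewer": reviewer,
        "verdict": verdict,
        "timestamp": time.time(),
    }
    # Sign the canonical JSON form so any tampering is detectable.
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(
        SIGNING_KEY, payload, hashlib.sha256
    ).hexdigest()
    return record


def verify(record: dict) -> bool:
    body = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])
```

Any edit to a stored record, even a single character of the command, invalidates the signature, which is what makes the trail usable as evidence rather than just logs.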
Practical integration
The good news: instrumenting an existing agent workflow is a few lines of code. With the expacti Python SDK:
```python
from expacti import ExpactiClient

client = ExpactiClient(
    backend_url="https://api.expacti.com",
    shell_token="your-token",
)

# This blocks until a reviewer approves or denies
result = client.run("rm -rf /old-logs")

if result.allowed:
    print("Approved — executing")
else:
    print(f"Denied: {result.reason}")
```
For LangChain agents, drop in the `ExpactiTool`:

```python
from expacti.langchain_tool import ExpactiTool

tools = [
    ExpactiTool(backend_url="https://api.expacti.com", shell_token="..."),
    # ... other tools
]

agent = create_react_agent(llm, tools, prompt)
```
Every command the agent proposes now goes through the approval queue before executing. The agent handles denials gracefully — it can retry with a modified command, or report back to the user that the action was blocked.
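A graceful denial handler can be as simple as trying progressively narrower commands and stopping at the first approval. Here `run` stands in for any gate with the allowed/reason shape shown earlier; the fallback logic and function names are invented for this sketch:

```python
def run_with_fallbacks(run, candidates):
    """Try candidate commands in order; stop at the first approval.

    `run` is any gate callable returning (allowed: bool, reason: str).
    """
    for command in candidates:
        allowed, reason = run(command)
        if allowed:
            return command
        # Denied: fall through to a narrower, safer variant.
        print(f"Denied ({reason}); trying a narrower command")
    # All candidates blocked: report back to the user instead of forcing.
    return None
```

The point is that denial is feedback, not failure: the agent proposes a less risky alternative rather than giving up or retrying the same blocked command.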
The right model for 2026
We're at an inflection point. AI agents are moving from "writes code for you" to "runs code for you." The security model hasn't caught up.
The teams that get this right will treat AI agent actions the same way they treat infrastructure changes: with explicit authorization, audit trails, and rollback capability. Not because they don't trust the AI, but because that's just how you run production systems responsibly.
Post-hoc log review was always a compromise. It made sense when the cost of a pre-execution gate was too high. Now that the tooling exists to make approval instant and low-friction, there's no reason to accept the risk.
Try expacti
Human-in-the-loop approval for every command your AI agents run. Free to start, no infrastructure required.
Get started for free →