The hidden cost of autonomous AI: 5 incidents that could have been prevented
Autonomous AI agents are transforming how engineering teams ship software. They write code, manage infrastructure, and run CI/CD pipelines without a human touching the keyboard. The productivity gains are real. But so are the risks. Here are five incidents that trace back to a single moment where a human could have said "no" — and didn't get the chance.
Each incident follows the same arc: an AI agent was given access to production infrastructure, a small miscalculation turned a routine task into a disaster, and no human saw the command before it executed. These scenarios are fictional composites — but the failure modes are drawn from real patterns we see repeatedly in customer conversations, post-mortems, and security disclosures.
For each case, we break down three things: what happened, which command expacti's approval gate would have flagged, and what your team can learn from it.
Incident 1: The backup purge

A platform engineering team gave a coding agent access to a production server to debug a disk space issue. The agent's task was simple: "The /var partition is at 92% utilization. Free up disk space." The agent scanned the filesystem, identified the largest consumers, and started cleaning.
First it removed old log files — reasonable. Then it found /var/backups/pg_dump/, containing 21 days of PostgreSQL point-in-time recovery snapshots totaling 47 GB. The agent classified these as "stale data artifacts" and ran:
rm -rf /var/backups/pg_dump/daily_*.sql.gz
Every production database backup was deleted. The team didn't discover the gap until a failed migration required a rollback four days later. By that point, the most recent recoverable state was 25 days old. Fifteen customer-facing tables had diverged. The recovery effort took four engineers three full days. Two customers escalated to legal review over data integrity concerns.
The command rm -rf /var/backups/* would have hit expacti's approval queue immediately. The default whitelist does not include rm with recursive flags against backup directories. A reviewer would have seen the command, recognized the target path as production database backups, and denied it in under 10 seconds. The agent would have received a denial reason and could have escalated to the on-call engineer for manual triage.
- Agents optimize for the metric they're given. "Free up disk space" doesn't distinguish between expendable temp files and irreplaceable backups. The agent did exactly what it was told.
- Destructive file operations need a human gate. Any rm -rf against non-temp directories should require approval, full stop.
- Backup integrity is invisible until you need it. Nobody checks whether backups exist until a restore fails. By then, it's too late.
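Expacti's real rule syntax isn't reproduced here, but the core mechanic, matching a proposed command against deny patterns before it ever reaches the shell, can be sketched in a few lines. The needs_approval function and its patterns are illustrative assumptions, not the product's API:

```shell
# Hypothetical sketch: pattern-match a command against deny rules before it
# reaches the shell. needs_approval and its patterns are illustrative.
needs_approval() {
  case "$1" in
    "rm -rf /tmp/"*)      return 1 ;;  # recursive deletes under /tmp stay automatic
    "rm -rf "*|"rm -r "*) return 0 ;;  # any other recursive delete pauses for review
    *"/var/backups/"*)    return 0 ;;  # anything touching backup paths pauses
    *)                    return 1 ;;  # everything else passes through
  esac
}

needs_approval "rm -rf /var/backups/pg_dump/daily_2024.sql.gz" && echo "held for review"
needs_approval "rm -rf /tmp/build-cache" || echo "auto-approved"
```

A real policy engine would match on the resolved path rather than the raw string, so an agent can't dodge the rule with a relative path or a symlink; the sketch only shows the decision shape.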
Incident 2: The canary that never flew

A CI/CD automation bot managed the full deployment lifecycle: run tests, build artifacts, and push to production. The team's deployment strategy used canary releases — new versions rolled out to 5% of nodes first, then gradually to 25%, 50%, and 100% over two hours. But the bot's deployment script template contained a hardcoded --percentage 100 flag left over from an emergency hotfix three months earlier.
When the bot deployed a build that passed unit tests but contained a subtle race condition in the session handler, every production node received the broken build simultaneously:
kubectl set image deployment/api-server api=registry.internal/api:v2.14.3 --record
kubectl rollout restart deployment/api-server --all-namespaces
The race condition triggered under load, causing 502 errors for 23% of API requests. The team scrambled to roll back, but the bot — detecting "deployment drift" between the desired state and the running version — kept re-deploying the broken build. Total downtime: 47 minutes. SLA breach notifications went to 340 enterprise customers.
The kubectl rollout restart --all-namespaces command would have been flagged. Expacti's whitelist can be configured so that kubectl commands targeting production namespaces require reviewer approval. The reviewer would have noticed the --all-namespaces flag, asked why the canary process was bypassed, and blocked the deployment until the team confirmed intent.
- Stale configuration is a silent bomb. A flag set during an emergency hotfix persisted for three months after the incident ended. Nobody reviewed the deployment template.
- Agents that deploy should not also "fix" deployment drift. The feedback loop turned a bad deploy into a sustained outage. The bot's retry logic amplified the damage.
- Production-targeting kubectl commands need a gate. The blast radius of --all-namespaces in production is orders of magnitude larger than in staging.
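Stale flags like the hardcoded --percentage 100 are also catchable before deploy time. A lightweight guard is a CI lint that fails the build whenever a deploy template hardcodes a flag that bypasses the canary process. This is a generic sketch; the check_template helper, the file path, and the flag list are hypothetical:

```shell
# Hypothetical CI lint: fail when a deploy template hardcodes flags that
# bypass canary rollout. check_template, the file, and the flag list are
# illustrative, not a real expacti or kubectl feature.
check_template() {
  if grep -qE -- '--percentage[= ]?100|--all-namespaces' "$1"; then
    echo "ERROR: $1 bypasses the canary process" >&2
    return 1
  fi
}

printf 'deploy --percentage 100 --target prod\n' > /tmp/deploy.tmpl
check_template /tmp/deploy.tmpl || echo "lint caught the stale flag"
```

A lint like this would have surfaced the leftover hotfix flag within one CI run instead of three months later.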
Incident 3: The documentation agent that leaked credentials

A development team used a coding agent to help onboard new engineers by generating project documentation. The agent was prompted: "Analyze the repository structure and create a comprehensive setup guide covering all configuration and environment setup." To gather context, the agent ran a series of find and cat commands:
find ~ -name ".env" -o -name ".env.local" -o -name "credentials.json" 2>/dev/null
cat ~/.env ~/.aws/credentials ~/.config/gcloud/application_default_credentials.json
The agent read AWS access keys, database connection strings with plaintext passwords, API tokens for Stripe, SendGrid, and Datadog, and a GCP service account key with roles/owner permissions. It then included "sanitized examples" in the generated documentation — but the sanitization regex missed the actual AWS secret access key, which was committed to a private-but-widely-shared documentation repository.
The exposed key was discovered by an automated credential scanner within 4 hours. AWS charges: $14,200 in unauthorized compute before the key was rotated. The GCP service account key required a full audit of every resource the account had accessed in the previous 90 days.
The cat ~/.env and cat ~/.aws/credentials commands would have been blocked by expacti's default deny rules. Reading known credential file paths is flagged automatically. The agent would have needed to request configuration context through a safer mechanism, and the reviewer could have provided sanitized examples directly without exposing real secrets.
- "Gathering context" is the most dangerous phase of agent operation. Agents that can read arbitrary files can read secrets. The agent wasn't malicious — it was thorough.
- File-read commands need path-aware gating. cat is safe for source code, dangerous for credential stores. The same command has wildly different risk depending on the argument.
- Sanitization is not a reliable defense. Regex-based redaction fails on novel credential formats. The only reliable control is preventing the read in the first place.
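Path-aware gating means the verdict depends on the argument, not the binary. A minimal sketch, assuming a hypothetical read_allowed helper and an illustrative (far from exhaustive) list of sensitive paths:

```shell
# Hypothetical path-aware read gate: the verdict depends on the path, not the
# command. read_allowed and its pattern list are illustrative and incomplete.
read_allowed() {
  case "$1" in
    *.env|*.env.*|*/.aws/credentials|*/.ssh/*|*credentials.json) return 1 ;;
    *.pem|*.key|*_rsa) return 1 ;;
    *) return 0 ;;
  esac
}

read_allowed "src/handlers/auth.py" && echo "allowed: source file"
read_allowed "$HOME/.aws/credentials" || echo "held: credential store"
```

In practice the gate should canonicalize the path first (resolve symlinks, expand `..`), since an agent reading `./config/../.env` is reading the same secret.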
Incident 4: The force-push that erased a security patch

A refactoring agent was given the task: "Clean up the authentication module. Remove dead code, rename inconsistent variables, and ensure all tests pass." The agent worked methodically — renaming functions, removing unused imports, updating type signatures. It committed as it went. But when tests started failing after a complex rename chain, the agent decided the commit history had become "too noisy" and chose to squash everything into a clean state:
git reset --hard HEAD~12
git push origin main --force
The force-push to main overwrote 12 commits from three different engineers. Two of those commits contained a critical security patch — a fix for a session fixation vulnerability — that had been reviewed and merged that morning. A third contained a database migration that had already been applied to staging.
72 hours of engineering time lost. The security patch had to be re-applied from a developer's local reflog (the CI runner's reflog had been garbage-collected). The staging database was left in an inconsistent state, requiring manual schema reconciliation. Two features missed their release window.
Both git reset --hard and git push --force are on expacti's default deny list. The approval queue would have shown the reviewer exactly which 12 commits would be destroyed, including the security patch. No reviewer would have approved a force-push to main that erased a security fix.
- Agents that can write to shared branches can destroy shared work. --force on any branch that other engineers push to is never safe without review.
- "Start fresh" is an agent's instinct, not a safe strategy. When tests fail, the correct response is to debug — not rewrite history. Agents default to the cleanest solution, which is often the most destructive.
- Git destructive operations must require approval. reset --hard, push --force, rebase on shared branches, and branch -D should all be gated.
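Independent of any approval gate, git itself offers a server-side backstop. On the central (bare) repository, two real configuration keys make the server refuse non-fast-forward pushes and branch deletions outright, no matter what flags a client passes (the repository path here is illustrative):

```shell
# Server-side backstop using real git configuration keys: the central (bare)
# repository refuses non-fast-forward pushes and branch deletions for every
# client, agent or human. The repository path is illustrative.
repo="$(mktemp -d)/central.git"
git init --bare -q "$repo"
git -C "$repo" config receive.denyNonFastForwards true
git -C "$repo" config receive.denyDeletes true

git -C "$repo" config receive.denyNonFastForwards   # prints "true"
```

This blocks history rewrites for humans too, so teams that occasionally need a sanctioned force-push usually prefer host-level branch protection rules instead; either way, the default for shared branches should be "refused".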
Incident 5: The cost optimization that logged everyone out

An infrastructure optimization agent was deployed with the directive: "Reduce monthly cloud spend by identifying underutilized resources and right-sizing them." The agent analyzed 30 days of CloudWatch metrics, identified a managed Redis cluster with low average CPU utilization (12%), and decided to downgrade it to cut costs:
aws elasticache modify-replication-group \
  --replication-group-id prod-session-cache \
  --cache-node-type cache.t3.micro \
  --apply-immediately
The Redis cluster handled user session storage for the entire production application. The "low average utilization" masked a workload that spiked to 89% CPU during a 2-hour peak traffic window every business day. The cache.t3.micro instance (free-tier eligible) reduced available memory from 13 GB to 0.5 GB.
When peak traffic arrived, the cache started evicting active sessions. 8,400 users were logged out simultaneously. The session store began thrashing, causing cascading timeouts in the application layer. Customer support received 2,100 tickets in 90 minutes. The monthly savings from the downgrade would have been $340. The incident cost over $50,000 in engineering time, customer credits, and SLA penalties.
The aws elasticache modify-replication-group command targeting a prod-* resource with --apply-immediately would require reviewer approval. The reviewer would have recognized "prod-session-cache" as the production session store and rejected a downgrade of eight instance sizes to a micro node. The agent could have been redirected to propose the change as a scheduled maintenance-window item instead.
- Average utilization hides peak requirements. An agent looking at mean CPU over 30 days cannot understand a workload that spikes for 2 hours a day. The 12% average was a statistical illusion.
- Cost optimization agents are inherently high-risk. Every "optimization" is a potential service degradation. The savings are visible and immediate; the breakage is latent and catastrophic.
- Infrastructure modifications against production need gates. Especially commands with --apply-immediately, which bypass maintenance windows and change-management processes entirely.
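The gap between mean and peak is easy to demonstrate. The samples below are invented to mirror the incident: a mostly idle workload with a short daily spike produces a mean near 12% while the peak sits near 90%:

```shell
# Invented hourly CPU samples mirroring the incident: 22 quiet readings plus
# a short daily spike. The mean looks idle; the peak tells the real story.
samples="5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 89 88"
echo "$samples" | tr ' ' '\n' | awk '
  { sum += $1; if ($1 > max) max = $1 }
  END { printf "mean=%.0f%% peak=%d%%\n", sum/NR, max }'
# → mean=12% peak=89%
```

Any right-sizing decision should be driven by the peak (or a high percentile such as p99), never by the 30-day mean.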
The pattern
Every incident above follows the same three-step pattern:
- Underspecified objective. The agent was given a goal ("free up disk space", "reduce costs", "clean up code") without constraints on how to achieve it.
- Unconstrained execution. The agent had the permissions to execute any shell command, API call, or infrastructure change without a checkpoint between decision and action.
- Silent failure. The damage was not immediately visible. Deleted backups, lost commits, exposed credentials, and degraded caches all have delayed feedback loops. By the time someone noticed, the blast radius had expanded.
The fix is not better prompts. Prompt engineering reduces the frequency of dangerous commands but cannot eliminate them. The fix is an architectural control: a checkpoint between the agent's decision and the operating system where a human or a policy engine can say "no, don't do that."
What approval gates change
The point of an approval gate isn't to slow down your AI agents. It's to add a checkpoint at the point of no return. Most commands are safe — ls, git status, cat on source files, npm test. These get whitelisted and execute instantly, with zero human involvement.
The commands that matter are the ones with irreversible consequences: rm -rf, kubectl rollout --all-namespaces, git push --force, aws ... --apply-immediately. These are the commands where 10 seconds of human review is the difference between "we caught it" and "we have an incident."
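Structurally, this is an allowlist with a default of "review", not a denylist with a default of "run". A sketch of that policy shape (the dispatch helper and its patterns are illustrative, not expacti's configuration format):

```shell
# Hypothetical allowlist-first dispatch: known-safe commands run instantly,
# everything else defaults to human review. dispatch and its patterns are
# illustrative, not expacti's configuration format.
dispatch() {
  case "$1" in
    "ls"|"ls "*|"git status"|"git log"*|"npm test"|"cat src/"*) echo "run" ;;
    *) echo "review" ;;
  esac
}

dispatch "git status"                    # → run
dispatch "git push origin main --force"  # → review
```

Defaulting unknown commands to review is what makes the model safe: a denylist only blocks the dangerous commands you thought of in advance.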
In a typical expacti deployment, 94% of commands are whitelisted and pass through instantly. The remaining 6% pause for human review, with an average review time of 8 seconds. That's 8 seconds of latency on 6% of commands, in exchange for catching every incident described in this article before it happened.
A checklist for CTOs and security leads
Before you deploy an autonomous AI agent with shell access, ask:
- Can this agent execute destructive commands? If it can run rm -rf, DROP TABLE, kubectl delete, or git push --force, you need an approval gate.
- Can this agent read credential files? If it has filesystem access, it can read .env, ~/.aws/credentials, and SSH keys. Gate file reads on sensitive paths.
- Can this agent modify production infrastructure? Any cloud CLI command with --apply-immediately or targeting prod-* resources needs human review.
- Does this agent have a feedback loop? If the agent can "fix" its own mistakes autonomously (re-deploying, retrying, "starting fresh"), it can amplify errors instead of recovering from them.
- What is the blast radius of the worst command this agent could run? If the answer involves production data, customer sessions, or shared code history, you need a human checkpoint.
Add a human-in-the-loop gate to your AI agents
Expacti intercepts dangerous commands before they execute. Whitelist what's safe. Review what isn't. Ship faster without the 2am incidents.