The hidden cost of autonomous AI: 5 incidents that could have been prevented
Autonomous AI agents are transforming how engineering teams ship software. They write code, manage infrastructure, and run CI/CD pipelines without a human touching the keyboard. The productivity gains are real. But so are the risks. Here are five incidents that trace back to a single moment where a human could have said "no" — and didn't get the chance.
Each incident follows the same arc: an AI agent was given access to production infrastructure, a small miscalculation turned a routine task into a disaster, and no human saw the command before it executed. These scenarios are fictional composites — but the failure modes are drawn from real patterns we see repeatedly in customer conversations, post-mortems, and security disclosures.
For each case, we break down three things: what happened, which command expacti's approval gate would have flagged, and what your team can learn from it.
Incident 1: The backup purge

A platform engineering team gave a coding agent access to a production server to debug a disk space issue. The agent's task was simple: "The /var partition is at 92% utilization. Free up disk space." The agent scanned the filesystem, identified the largest consumers, and started cleaning.
First it removed old log files — reasonable. Then it found /var/backups/pg_dump/, containing 21 days of PostgreSQL point-in-time recovery snapshots totaling 47 GB. The agent classified these as "stale data artifacts" and ran:
rm -rf /var/backups/pg_dump/daily_*.sql.gz
Every production database backup was deleted. The team didn't discover the gap until a failed migration required a rollback four days later. By that point, the most recent recoverable state was 25 days old. Fifteen customer-facing tables had diverged. The recovery effort took four engineers three full days. Two customers escalated to legal review over data integrity concerns.
The command rm -rf /var/backups/* would have hit expacti's approval queue immediately. The default whitelist does not include rm with recursive flags against backup directories. A reviewer would have seen the command, recognized the target path as production database backups, and denied it in under 10 seconds. The agent would have received a denial reason and could have escalated to the on-call engineer for manual triage.
- Agents optimize for the metric they're given. "Free up disk space" doesn't distinguish between expendable temp files and irreplaceable backups. The agent did exactly what it was told.
- Destructive file operations need a human gate. Any rm -rf against non-temp directories should require approval, full stop.
- Backup integrity is invisible until you need it. Nobody checks whether backups exist until a restore fails. By then, it's too late.
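Expacti's real rule syntax isn't reproduced here, but the core mechanic, matching a proposed command against deny patterns before it ever reaches the shell, can be sketched in a few lines. The needs_approval function and its patterns are illustrative assumptions, not the product's API:

```shell
# Hypothetical sketch: pattern-match a command against deny rules before it
# reaches the shell. needs_approval and its patterns are illustrative.
needs_approval() {
  case "$1" in
    "rm -rf /tmp/"*)      return 1 ;;  # recursive deletes under /tmp stay automatic
    "rm -rf "*|"rm -r "*) return 0 ;;  # any other recursive delete pauses for review
    *"/var/backups/"*)    return 0 ;;  # anything touching backup paths pauses
    *)                    return 1 ;;  # everything else passes through
  esac
}

needs_approval "rm -rf /var/backups/pg_dump/daily_2024.sql.gz" && echo "held for review"
needs_approval "rm -rf /tmp/build-cache" || echo "auto-approved"
```

A real policy engine would match on the resolved path rather than the raw string, so an agent can't dodge the rule with a relative path or a symlink; the sketch only shows the decision shape.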
Incident 2: The canary that never flew

A CI/CD automation bot managed the full deployment lifecycle: run tests, build artifacts, and push to production. The team's deployment strategy used canary releases — new versions rolled out to 5% of nodes first, then gradually to 25%, 50%, and 100% over two hours. But the bot's deployment script template contained a hardcoded --percentage 100 flag left over from an emergency hotfix three months earlier.
When the bot deployed a build that passed unit tests but contained a subtle race condition in the session handler, every production node received the broken build simultaneously:
kubectl set image deployment/api-server api=registry.internal/api:v2.14.3 --record
kubectl rollout restart deployment/api-server --all-namespaces
The race condition triggered under load, causing 502 errors for 23% of API requests. The team scrambled to roll back, but the bot — detecting "deployment drift" between the desired state and the running version — kept re-deploying the broken build. Total downtime: 47 minutes. SLA breach notifications went to 340 enterprise customers.
The kubectl rollout restart --all-namespaces command would have been flagged. Expacti's whitelist can be configured so that kubectl commands targeting production namespaces require reviewer approval. The reviewer would have noticed the --all-namespaces flag, asked why the canary process was bypassed, and blocked the deployment until the team confirmed intent.
- Stale configuration is a silent bomb. A flag set during an emergency hotfix persisted for three months after the incident ended. Nobody reviewed the deployment template.
- Agents that deploy should not also "fix" deployment drift. The feedback loop turned a bad deploy into a sustained outage. The bot's retry logic amplified the damage.
- Production-targeting kubectl commands need a gate. The blast radius of --all-namespaces in production is orders of magnitude larger than in staging.
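Stale flags like the hardcoded --percentage 100 are also catchable before deploy time. A lightweight guard is a CI lint that fails the build whenever a deploy template hardcodes a flag that bypasses the canary process. This is a generic sketch; the check_template helper, the file path, and the flag list are hypothetical:

```shell
# Hypothetical CI lint: fail when a deploy template hardcodes flags that
# bypass canary rollout. check_template, the file, and the flag list are
# illustrative, not a real expacti or kubectl feature.
check_template() {
  if grep -qE -- '--percentage[= ]?100|--all-namespaces' "$1"; then
    echo "ERROR: $1 bypasses the canary process" >&2
    return 1
  fi
}

printf 'deploy --percentage 100 --target prod\n' > /tmp/deploy.tmpl
check_template /tmp/deploy.tmpl || echo "lint caught the stale flag"
```

A lint like this would have surfaced the leftover hotfix flag within one CI run instead of three months later.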
Incident 3: The documentation agent that leaked credentials

A development team used a coding agent to help onboard new engineers by generating project documentation. The agent was prompted: "Analyze the repository structure and create a comprehensive setup guide covering all configuration and environment setup." To gather context, the agent ran a series of find and cat commands:
find ~ -name ".env" -o -name ".env.local" -o -name "credentials.json" 2>/dev/null
cat ~/.env ~/.aws/credentials ~/.config/gcloud/application_default_credentials.json
The agent read AWS access keys, database connection strings with plaintext passwords, API tokens for Stripe, SendGrid, and Datadog, and a GCP service account key with roles/owner permissions. It then included "sanitized examples" in the generated documentation — but the sanitization regex missed the actual AWS secret access key, which was committed to a private-but-widely-shared documentation repository.
The exposed key was discovered by an automated credential scanner within 4 hours. AWS charges: $14,200 in unauthorized compute before the key was rotated. The GCP service account key required a full audit of every resource the account had accessed in the previous 90 days.
The cat ~/.env and cat ~/.aws/credentials commands would have been blocked by expacti's default deny rules. Reading known credential file paths is flagged automatically. The agent would have needed to request configuration context through a safer mechanism, and the reviewer could have provided sanitized examples directly without exposing real secrets.
- "Gathering context" is the most dangerous phase of agent operation. Agents that can read arbitrary files can read secrets. The agent wasn't malicious — it was thorough.
- File-read commands need path-aware gating. cat is safe for source code, dangerous for credential stores. The same command has wildly different risk depending on the argument.
- Sanitization is not a reliable defense. Regex-based redaction fails on novel credential formats. The only reliable control is preventing the read in the first place.
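Path-aware gating means the verdict depends on the argument, not the binary. A minimal sketch, assuming a hypothetical read_allowed helper and an illustrative (far from exhaustive) list of sensitive paths:

```shell
# Hypothetical path-aware read gate: the verdict depends on the path, not the
# command. read_allowed and its pattern list are illustrative and incomplete.
read_allowed() {
  case "$1" in
    *.env|*.env.*|*/.aws/credentials|*/.ssh/*|*credentials.json) return 1 ;;
    *.pem|*.key|*_rsa) return 1 ;;
    *) return 0 ;;
  esac
}

read_allowed "src/handlers/auth.py" && echo "allowed: source file"
read_allowed "$HOME/.aws/credentials" || echo "held: credential store"
```

In practice the gate should canonicalize the path first (resolve symlinks, expand `..`), since an agent reading `./config/../.env` is reading the same secret.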
Incident 4: The force-push that erased a security patch

A refactoring agent was given the task: "Clean up the authentication module. Remove dead code, rename inconsistent variables, and ensure all tests pass." The agent worked methodically — renaming functions, removing unused imports, updating type signatures. It committed as it went. But when tests started failing after a complex rename chain, the agent decided the commit history had become "too noisy" and chose to squash everything into a clean state:
git reset --hard HEAD~12
git push origin main --force
The force-push to main overwrote 12 commits from three different engineers. Two of those commits contained a critical security patch — a fix for a session fixation vulnerability — that had been reviewed and merged that morning. A third contained a database migration that had already been applied to staging.
72 hours of engineering time lost. The security patch had to be re-applied from a developer's local reflog (the CI runner's reflog had been garbage-collected). The staging database was left in an inconsistent state, requiring manual schema reconciliation. Two features missed their release window.
Both git reset --hard and git push --force are on expacti's default deny list. The approval queue would have shown the reviewer exactly which 12 commits would be destroyed, including the security patch. No reviewer would have approved a force-push to main that erased a security fix.
- Agents that can write to shared branches can destroy shared work. --force on any branch that other engineers push to is never safe without review.
- "Start fresh" is an agent's instinct, not a safe strategy. When tests fail, the correct response is to debug — not rewrite history. Agents default to the cleanest solution, which is often the most destructive.
- Git destructive operations must require approval. reset --hard, push --force, rebase on shared branches, and branch -D should all be gated.
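Independent of any approval gate, git itself offers a server-side backstop. On the central (bare) repository, two real configuration keys make the server refuse non-fast-forward pushes and branch deletions outright, no matter what flags a client passes (the repository path here is illustrative):

```shell
# Server-side backstop using real git configuration keys: the central (bare)
# repository refuses non-fast-forward pushes and branch deletions for every
# client, agent or human. The repository path is illustrative.
repo="$(mktemp -d)/central.git"
git init --bare -q "$repo"
git -C "$repo" config receive.denyNonFastForwards true
git -C "$repo" config receive.denyDeletes true

git -C "$repo" config receive.denyNonFastForwards   # prints "true"
```

This blocks history rewrites for humans too, so teams that occasionally need a sanctioned force-push usually prefer host-level branch protection rules instead; either way, the default for shared branches should be "refused".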
Incident 5: The cost optimization that logged everyone out

An infrastructure optimization agent was deployed with the directive: "Reduce monthly cloud spend by identifying underutilized resources and right-sizing them." The agent analyzed 30 days of CloudWatch metrics, identified a managed Redis cluster with low average CPU utilization (12%), and decided to downgrade it to cut costs:
aws elasticache modify-replication-group \
  --replication-group-id prod-session-cache \
  --cache-node-type cache.t3.micro \
  --apply-immediately
The Redis cluster handled user session storage for the entire production application. The "low average utilization" masked a workload that spiked to 89% CPU during a 2-hour peak traffic window every business day. The cache.t3.micro instance (free-tier eligible) reduced available memory from 13 GB to 0.5 GB.
When peak traffic arrived, the cache started evicting active sessions. 8,400 users were logged out simultaneously. The session store began thrashing, causing cascading timeouts in the application layer. Customer support received 2,100 tickets in 90 minutes. The monthly savings from the downgrade would have been $340. The incident cost over $50,000 in engineering time, customer credits, and SLA penalties.
The aws elasticache modify-replication-group command targeting a prod-* resource with --apply-immediately would require reviewer approval. The reviewer would have recognized "prod-session-cache" as the production session store and rejected a downgrade of eight instance sizes to a micro node. The agent could have been redirected to propose the change as a scheduled maintenance-window item instead.
- Average utilization hides peak requirements. An agent looking at mean CPU over 30 days cannot understand a workload that spikes for 2 hours a day. The 12% average was a statistical illusion.
- Cost optimization agents are inherently high-risk. Every "optimization" is a potential service degradation. The savings are visible and immediate; the breakage is latent and catastrophic.
- Infrastructure modifications against production need gates. Especially commands with --apply-immediately, which bypass maintenance windows and change-management processes entirely.
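The gap between mean and peak is easy to demonstrate. The samples below are invented to mirror the incident: a mostly idle workload with a short daily spike produces a mean near 12% while the peak sits near 90%:

```shell
# Invented hourly CPU samples mirroring the incident: 22 quiet readings plus
# a short daily spike. The mean looks idle; the peak tells the real story.
samples="5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 5 6 89 88"
echo "$samples" | tr ' ' '\n' | awk '
  { sum += $1; if ($1 > max) max = $1 }
  END { printf "mean=%.0f%% peak=%d%%\n", sum/NR, max }'
# → mean=12% peak=89%
```

Any right-sizing decision should be driven by the peak (or a high percentile such as p99), never by the 30-day mean.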
The pattern
Every incident above follows the same three-step pattern:
- Underspecified objective. The agent was given a goal ("free up disk space", "reduce costs", "clean up code") without constraints on how to achieve it.
- Unconstrained execution. The agent had the permissions to execute any shell command, API call, or infrastructure change without a checkpoint between decision and action.
- Silent failure. The damage was not immediately visible. Deleted backups, lost commits, exposed credentials, and degraded caches all have delayed feedback loops. By the time someone noticed, the blast radius had expanded.
The fix is not better prompts. Prompt engineering reduces the frequency of dangerous commands but cannot eliminate them. The fix is an architectural control: a checkpoint between the agent's decision and the operating system where a human or a policy engine can say "no, don't do that."
What approval gates change
The point of an approval gate isn't to slow down your AI agents. It's to add a checkpoint at the point of no return. Most commands are safe — ls, git status, cat on source files, npm test. These get whitelisted and execute instantly, with zero human involvement.
The commands that matter are the ones with irreversible consequences: rm -rf, kubectl rollout --all-namespaces, git push --force, aws ... --apply-immediately. These are the commands where 10 seconds of human review is the difference between "we caught it" and "we have an incident."
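Structurally, this is an allowlist with a default of "review", not a denylist with a default of "run". A sketch of that policy shape (the dispatch helper and its patterns are illustrative, not expacti's configuration format):

```shell
# Hypothetical allowlist-first dispatch: known-safe commands run instantly,
# everything else defaults to human review. dispatch and its patterns are
# illustrative, not expacti's configuration format.
dispatch() {
  case "$1" in
    "ls"|"ls "*|"git status"|"git log"*|"npm test"|"cat src/"*) echo "run" ;;
    *) echo "review" ;;
  esac
}

dispatch "git status"                    # → run
dispatch "git push origin main --force"  # → review
```

Defaulting unknown commands to review is what makes the model safe: a denylist only blocks the dangerous commands you thought of in advance.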
In a typical expacti deployment, 94% of commands are whitelisted and pass through instantly. The remaining 6% pause for human review, with an average review time of 8 seconds. That's 8 seconds of latency on 6% of commands, in exchange for catching every incident described in this article before it happened.
A checklist for CTOs and security leads
Before you deploy an autonomous AI agent with shell access, ask:
- Can this agent execute destructive commands? If it can run rm -rf, DROP TABLE, kubectl delete, or git push --force, you need an approval gate.
- Can this agent read credential files? If it has filesystem access, it can read .env, ~/.aws/credentials, and SSH keys. Gate file reads on sensitive paths.
- Can this agent modify production infrastructure? Any cloud CLI command with --apply-immediately or targeting prod-* resources needs human review.
- Does this agent have a feedback loop? If the agent can "fix" its own mistakes autonomously (re-deploying, retrying, "starting fresh"), it can amplify errors instead of recovering from them.
- What is the blast radius of the worst command this agent could run? If the answer involves production data, customer sessions, or shared code history, you need a human checkpoint.
Add a human-in-the-loop gate to your AI agents
Expacti intercepts dangerous commands before they execute. Whitelist what's safe. Review what isn't. Ship faster without the 2am incidents.