№ 028, May 2026

AI Agent Access Control After the 9-Second Database Wipe.

An AI agent deleted a production database in nine seconds. The fix is not better prompts. It is access control the agent cannot override. Here is how.

Nine seconds, one curl, no backups

On April 25, 2026, a Cursor agent running Claude Opus 4.6 deleted the entire production database for an automotive SaaS called PocketOS. The agent was working in staging, hit a credential mismatch, and decided to "resolve" the inconsistency by issuing a single API call to Railway. The Register reported the action took nine seconds. The volume held production data and volume-level backups in the same place, so both went together. Railway's CEO restored the data the following Sunday by recovering the underlying block storage.

The agent had access to a Railway API token that was meant for managing custom domains. The token had not been scoped down, so it carried authority for any operation, including deletions. The agent did not "go rogue" in any interesting sense. It found a token, mapped a problem to a tool call, and ran the call.

This post is not a victory lap on prompt engineering. The fix is not in the prompt. The fix is in the access control surface that the agent operates against. If your production environment can be destroyed by a single token that the agent can find and use, the failure was already there before the agent showed up.

The blame conversation is the wrong conversation

The most-shared response to the PocketOS incident, Ibrahima Diallo's "AI didn't delete your database, you did", makes the right point. Having "a public-facing API that can delete all your production databases" is the design defect. The agent is the accelerant, not the spark.

Teams that adopt AI agents safely in regulated environments do not start with model selection or prompt design. They start by asking which credentials live in the agent's process, what those credentials can do at the cloud or platform layer, and what the recovery path looks like if the agent does the worst plausible thing in the smallest plausible window. The nine-second window is what happens when an agent has high-blast-radius credentials and the freedom to use them without a second checker.

Where the boundary actually lives

A workable mental model: an agent has three layers of authority.

1. The model layer. What the LLM "wants" to do, as influenced by the prompt, the system prompt, the tools you describe, and the data it sees in context.
2. The tool layer. The functions you expose to the model and the policy your code enforces inside those functions.
3. The credential layer. What the underlying tokens, IAM roles, and database roles actually permit at the API of the downstream system.

Prompt-level rules ("never run destructive commands without permission") sit at layer one. Layer one is the weakest. It is influenced by anything in context, including content the agent reads from a ticket, a file, a database row, or a returned API response. Treating layer one as a security control is what gets you a nine-second wipe.

Layer two is your code. You decide whether a tool called delete_volume even exists, what arguments it accepts, what guardrails fire before the underlying call.
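A minimal sketch of that idea, assuming a hypothetical in-process tool registry (the tool names and handler shape below are illustrative, not taken from any SDK). The point is that a tool the code never registers cannot be called, whatever the model was told:

```typescript
// Hypothetical tool registry: names and handlers are illustrative only.
type ToolHandler = (input: Record<string, unknown>) => string;

const registry = new Map<string, ToolHandler>();

registry.set('describe_volume', (input) => `volume ${String(input.id)}: ok`);
registry.set('list_domains', () => 'example.com');
// Deliberately no 'delete_volume': the model cannot call what the code never exposes.

function dispatch(
  name: string,
  input: Record<string, unknown>
): { ok: boolean; result: string } {
  const handler = registry.get(name);
  if (!handler) {
    // Unknown tools fail closed, regardless of what the prompt or context said.
    return { ok: false, result: `tool "${name}" is not available` };
  }
  return { ok: true, result: handler(input) };
}
```

A model that asks for delete_volume gets a structured refusal back instead of a side effect.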

Layer three is the cloud provider. AWS IAM policies, Postgres role grants, and platform-specific scoped tokens are the only layer that holds when everything above it is compromised by prompt injection or by the agent making an honest but catastrophic decision.

Ship layers two and three before you ship layer one.

Pattern 1: Scope the credential, not the prompt

If you are giving an agent a Railway, AWS, or platform token, that token should do exactly what you need and nothing else. For AWS, this is the standard least-privilege IAM practice you already follow for human users, applied to the agent's role. Use specific actions, specific resource ARNs, condition keys that pin region and account, and explicit denies for destructive operations the agent has no reason to ever call.

A read-mostly agent looks something like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ReadOnlyForAgent",
      "Effect": "Allow",
      "Action": [
        "rds:Describe*",
        "logs:GetLogEvents",
        "cloudwatch:GetMetricData"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": { "aws:RequestedRegion": "us-east-1" }
      }
    },
    {
      "Sid": "DenyDestructive",
      "Effect": "Deny",
      "Action": [
        "rds:Delete*",
        "rds:Modify*",
        "ec2:TerminateInstances",
        "s3:DeleteBucket",
        "kms:ScheduleKeyDeletion"
      ],
      "Resource": "*"
    }
  ]
}

The explicit Deny matters. In AWS policy evaluation logic, an explicit deny in any applicable policy overrides every allow, so even a wildcard allow added in another statement cannot re-enable these actions. For platform tokens that lack this granularity, your job is harder. Either keep the agent away from tokens with destructive scope, or wrap the platform's API behind your own policy layer.
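For a platform without scoped tokens, the wrapper can borrow the same deny-wins rule. The sketch below is an assumption about how such a layer might look: the statement shape is loosely modeled on IAM, and the action names (domain:*, volume:Delete*) are hypothetical, not a real Railway or AWS API.

```typescript
// Hypothetical policy shape, loosely modeled on IAM statements.
type Effect = 'Allow' | 'Deny';
interface Statement {
  effect: Effect;
  actions: string[];
}

// Supports exact names and a trailing "*" wildcard, e.g. "volume:Delete*".
function matches(pattern: string, action: string): boolean {
  return pattern.endsWith('*')
    ? action.startsWith(pattern.slice(0, -1))
    : pattern === action;
}

function isAllowed(policy: Statement[], action: string): boolean {
  const hits = policy.filter((s) => s.actions.some((p) => matches(p, action)));
  // An explicit deny wins over any allow; no matching statement is an implicit deny.
  if (hits.some((s) => s.effect === 'Deny')) return false;
  return hits.some((s) => s.effect === 'Allow');
}

const agentPolicy: Statement[] = [
  { effect: 'Allow', actions: ['domain:*', 'volume:Describe*'] },
  { effect: 'Deny', actions: ['volume:Delete*', 'database:Delete*'] },
  // A later, overly broad grant still cannot re-enable deletes.
  { effect: 'Allow', actions: ['*'] },
];

// The agent's tools call this wrapper instead of the raw platform client.
function callPlatform(action: string, run: () => string): string {
  if (!isAllowed(agentPolicy, action)) {
    throw new Error(`policy denies ${action}`);
  }
  return run();
}
```

The wrapper only helps if the raw token lives in the wrapper's process and not in the agent's.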

For Postgres, this is the moment to actually use role separation. The agent connects as a role with SELECT on a specific schema. A separate, explicitly named role holds INSERT, UPDATE, DELETE. A third role can run schema migrations. The agent never has more than one role at a time, and the destructive role is unreachable from the agent's process.
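One way to keep the three roles explicit is to generate the grants from code and review them like any other change. This is a sketch under assumptions: the role names (agent_reader, app_writer, schema_migrator) are hypothetical, and the writer role also gets SELECT because UPDATE and DELETE statements with a WHERE clause need to read the columns they filter on.

```typescript
// Role names are hypothetical; the SQL is standard PostgreSQL GRANT syntax.
function roleSetup(schema: string): string[] {
  // Reject anything that is not a plain identifier before interpolating it.
  if (!/^[a-z_][a-z0-9_]*$/.test(schema)) {
    throw new Error(`invalid schema name: ${schema}`);
  }
  return [
    // Read-only role: the only one whose credentials reach the agent's process.
    `CREATE ROLE agent_reader LOGIN`,
    `GRANT USAGE ON SCHEMA ${schema} TO agent_reader`,
    `GRANT SELECT ON ALL TABLES IN SCHEMA ${schema} TO agent_reader`,
    // Writer role: used by the approval-gated service, never by the agent.
    `CREATE ROLE app_writer LOGIN`,
    `GRANT USAGE ON SCHEMA ${schema} TO app_writer`,
    `GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA ${schema} TO app_writer`,
    // Migration role: may create objects in the schema.
    `CREATE ROLE schema_migrator LOGIN`,
    `GRANT USAGE, CREATE ON SCHEMA ${schema} TO schema_migrator`,
  ];
}
```

Note that GRANT ... ON ALL TABLES covers only tables that exist when it runs; tables created later need their own grants or default privileges.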

Pattern 2: Read by default, write through approval

The tool layer is where you implement two-key authority. Define your tools so that any operation with non-trivial blast radius returns "needs approval" instead of executing.

The Anthropic tool use API gives you the shape directly. Your application drives the loop, so a tool can decline to execute and return a tool_result that puts the work into a queue. A simplified pattern in TypeScript:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

const tools = [
  {
    name: 'delete_records',
    description: 'Delete rows from the customers table.',
    input_schema: {
      type: 'object',
      properties: {
        ids: { type: 'array', items: { type: 'string' } },
        reason: { type: 'string' }
      },
      required: ['ids', 'reason']
    }
  }
];

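// Hypothetical context your application passes to each tool call.
interface Ctx {
  agentId: string;
  approver?: string; // set only after a named human approves, outside the agent's context
}

// Assumed application helpers: an approval queue and a data-access layer.
declare function enqueueApproval(req: {
  tool: string; input: unknown; requester: string; sla: string;
}): Promise<string>;
declare const db: { deleteRecords(ids: string[], approver: string): Promise<unknown> };

async function handleToolCall(name: string, input: any, ctx: Ctx) {
  if (name === 'delete_records') {
    // Queue instead of executing when the call is a bulk delete or has no approver.
    if (input.ids.length > 1 || !ctx.approver) {
      const approvalId = await enqueueApproval({
        tool: name,
        input,
        requester: ctx.agentId,
        sla: '1h'
      });
      return {
        status: 'pending_approval',
        approval_id: approvalId,
        message: 'Destructive operation queued. Awaiting approver.'
      };
    }
    // Reached only for a single-row delete with an approver attached.
    return await db.deleteRecords(input.ids, ctx.approver);
  }
  // other tools
}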

Two things matter. First, the tool returns a structured tool_result that the model can read, so the agent can plan around the queued operation. Second, the destructive call only runs when an external approver, identified separately from the agent context, has signed off. The approval gate is enforced by the function, not by the prompt.

If you already run a change-management system, plug into it. The agent submits a ticket, a named human approves, the tool layer reads the approval and executes. You get audit, identity, and revocation for free.
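If you do not have such a system yet, the queue side of the gate can start small. The sketch below is an in-memory illustration only (a real queue would be durable and would authenticate the approver); every name in it is hypothetical. It shows the two properties that matter: the requester cannot approve its own request, and one approval buys exactly one execution.

```typescript
// In-memory sketch of an approval queue; all names are hypothetical.
interface PendingOp {
  id: string;
  tool: string;
  input: unknown;
  requester: string;
  approvedBy?: string;
}

const pending = new Map<string, PendingOp>();
let nextId = 1;

function enqueue(tool: string, input: unknown, requester: string): string {
  const id = `apr-${nextId++}`;
  pending.set(id, { id, tool, input, requester });
  return id;
}

function approve(id: string, approver: string): void {
  const op = pending.get(id);
  if (!op) throw new Error(`unknown approval ${id}`);
  // Two-key rule: the requester can never approve its own request.
  if (approver === op.requester) throw new Error('requester cannot self-approve');
  op.approvedBy = approver;
}

function execute(id: string, run: (op: PendingOp) => string): string {
  const op = pending.get(id);
  if (!op || !op.approvedBy) throw new Error(`approval ${id} is not approved`);
  pending.delete(id); // one approval, one execution
  return run(op);
}
```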

Pattern 3: Backup separation the agent cannot reach

The PocketOS recovery worked because Railway had block-level snapshots their CEO could restore. If a destructive action also takes out the recovery path, you do not have backups, you have a ceremony.

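For AWS workloads, AWS Backup Vault Lock in compliance mode provides write-once-read-many (WORM) retention. Once the cooling-off period expires, the lock becomes immutable, and not even the root user can remove it or delete recovery points before their retention ends. Set up: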

Backups land in a separate AWS account with no trust relationship to the agent's account.
The destination vault is locked in compliance mode with a minimum retention period.
The KMS key for the backup vault is in the destination account, with a key policy that does not permit kms:ScheduleKeyDeletion from the source account.

This is the part that a misaligned agent cannot undo from inside the production account. Even if the agent escalates to admin in production and runs DeleteDBInstance everywhere, the backups in the locked vault survive.
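These three conditions tend to drift, so it can help to check them before every deploy. The sketch below assumes you can summarize your backup setup into a small record; the field names are hypothetical and are not an AWS API shape. It only encodes the rules listed above.

```typescript
// Hypothetical summary of a backup setup; field names are illustrative.
interface BackupSetup {
  sourceAccountId: string;
  vaultAccountId: string;
  vaultLockMode: 'none' | 'governance' | 'compliance';
  minRetentionDays: number;
  kmsKeyAccountId: string;
  // Accounts the key policy allows to call kms:ScheduleKeyDeletion.
  keyDeletionAccounts: string[];
}

// Returns one finding per violated rule; an empty list means the setup passes.
function backupFindings(s: BackupSetup): string[] {
  const findings: string[] = [];
  if (s.vaultAccountId === s.sourceAccountId) {
    findings.push('vault shares an account with the agent');
  }
  if (s.vaultLockMode !== 'compliance') {
    findings.push('vault is not locked in compliance mode');
  }
  if (s.minRetentionDays < 1) {
    findings.push('no minimum retention period is set');
  }
  if (s.kmsKeyAccountId !== s.vaultAccountId) {
    findings.push('KMS key lives outside the destination account');
  }
  if (s.keyDeletionAccounts.includes(s.sourceAccountId)) {
    findings.push('source account can schedule key deletion');
  }
  return findings;
}
```

A check like this does not replace a restore test; it only catches configuration drift early.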

For non-AWS platforms, the principle is the same: backups live in a separate trust boundary, with retention that is set by policy outside the system the agent operates in. If your platform does not support that, treat it as a deployment risk, not a configuration question.

Pattern 4: Two agents, two scopes

A useful pattern we recommend for higher-risk integrations is the planner-executor split. The planner runs with broad read access. It can query, summarize, and propose actions. It cannot execute. The executor is a smaller, narrower agent or service that takes structured action requests, validates them against policy, and runs them with the minimum credential needed for that specific action.

This works because the planner sees untrusted content, like prompts, tickets, web pages, and external documents, while the executor only ever sees structured input from the planner. Prompt injection in a ticket can confuse the planner. It cannot reach the executor's credentials directly, because the executor accepts a typed schema, not free text.
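A sketch of the executor's front door, assuming two hypothetical action kinds and made-up credential names. The executor parses whatever the planner sends into a closed set of typed actions, rejects everything else (including free text), and picks the credential from the action kind rather than from anything the planner says:

```typescript
// Hypothetical action schema; kinds, services, and role names are illustrative.
type ActionRequest =
  | { kind: 'restart_service'; service: string }
  | { kind: 'scale_service'; service: string; replicas: number };

const ALLOWED_SERVICES = new Set(['api', 'worker']);

// One narrowly scoped credential per action kind.
const CREDENTIAL_FOR: Record<ActionRequest['kind'], string> = {
  restart_service: 'role/restart-only',
  scale_service: 'role/scale-only',
};

function parseAction(raw: unknown): ActionRequest {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('action must be a structured object, not free text');
  }
  const r = raw as Record<string, unknown>;
  const kind = r.kind;
  const service = r.service;
  const replicas = r.replicas;
  if (typeof service !== 'string' || !ALLOWED_SERVICES.has(service)) {
    throw new Error('unknown service');
  }
  switch (kind) {
    case 'restart_service':
      return { kind: 'restart_service', service };
    case 'scale_service':
      if (typeof replicas !== 'number' || !Number.isInteger(replicas) || replicas < 1 || replicas > 10) {
        throw new Error('replicas must be an integer from 1 to 10');
      }
      return { kind: 'scale_service', service, replicas };
    default:
      throw new Error('unsupported action kind');
  }
}

// Stand-in for real execution: reports which credential the action would use.
function executeAction(raw: unknown): string {
  const action = parseAction(raw);
  return `run ${action.kind} on ${action.service} with ${CREDENTIAL_FOR[action.kind]}`;
}
```

Because unknown fields are dropped and unknown kinds are rejected, an injected instruction has no slot to travel in.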

Combine this with MCP servers that each hold a narrowly scoped token, and no single tool call carries more authority than its action requires.

A pre-deployment access checklist

Before any agent touches production, work through this list:

List every credential reachable from the agent's process. Token, role, kubeconfig, mounted secret. If you cannot list them, the agent is not ready.
For each credential, write down the worst single command it can execute. If the description of any of those commands ends with "and there is no recovery," fix that before anything else.
Replace standing tokens with short-lived credentials issued per task. Federate where possible.
Pin the agent's IAM policy to specific resources and actions. Add explicit denies for destructive operations the agent has no business calling.
Move every destructive tool behind an approval gate at the tool layer. Approval is a separate identity, logged.
Configure backups to a separate trust boundary with locked retention. Test the restore path against a real failure scenario.
Build a kill switch that revokes all agent credentials in one operator action. Practice using it.
Log every tool call with input, output, and approver. Send the log to a sink the agent cannot reach.

Run that list against the agent before it joins your stack. Run it again after every meaningful change to the agent's tool set or scope.
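Two of the items above, short-lived per-task credentials and a one-action kill switch, fit in a small broker. The sketch below is an in-memory illustration under assumptions: the class and method names are hypothetical, tokens are sequential for readability (a real broker would issue random secrets or federated credentials), and time is passed in explicitly so the behavior is easy to test.

```typescript
// In-memory sketch of a credential broker; names are hypothetical.
interface TaskCredential {
  token: string;
  scope: string;
  expiresAt: number; // epoch milliseconds
}

class CredentialBroker {
  private active = new Map<string, TaskCredential>();
  private counter = 0;
  private halted = false;

  // Issue a credential for one task, one scope, and a fixed lifetime.
  issue(scope: string, ttlMs: number, now: number): TaskCredential {
    if (this.halted) throw new Error('broker halted by kill switch');
    const cred: TaskCredential = {
      token: `task-${++this.counter}`, // sequential for illustration only
      scope,
      expiresAt: now + ttlMs,
    };
    this.active.set(cred.token, cred);
    return cred;
  }

  // A token is valid only for its own scope and only before it expires.
  isValid(token: string, scope: string, now: number): boolean {
    const cred = this.active.get(token);
    return cred !== undefined && cred.scope === scope && now < cred.expiresAt;
  }

  // One operator action: revoke everything and refuse new issuance.
  killSwitch(): number {
    const revoked = this.active.size;
    this.active.clear();
    this.halted = true;
    return revoked;
  }
}
```

Downstream systems have to check validity against the broker (or trust only its short expiry) for the kill switch to mean anything.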

The speed of deployment vs the speed of governance

The pattern behind the PocketOS incident shows up in every team adopting agents quickly. Tools ship in days, tokens get reused, scope creeps, and nobody owns the full picture of what the agent can actually do.

This is the same operational gap we flagged in the agentic CVE disclosure post last week. Agent capabilities are growing faster than the controls around them, and the controls that work are the boring ones we already know how to build for human operators. Treat the agent like a privileged service account that occasionally has bad ideas. Give it the access a service account in that role would get and not a token more.

Teams that get this right are not smarter about prompts. They are older-school about access control.

Kai Token leads engineering at Fraktional. We help teams adopt AI in regulated environments by building the access controls, evals, and observability that make agents safe to deploy. The interesting work is rarely in the model.