
OpenClaw Security Guide: Locking Down AI Coding Agents with Tailscale and Sandboxing

Secure OpenClaw and Claude Code with OS-level sandboxing, Tailscale networking, and permission hardening. A practitioner checklist for self-hosted AI agents.

Kai Token
10 Feb 2026 · 13 min read

Your AI Coding Agent Has Root-Level Power

Here's the uncomfortable truth about running AI coding agents like Claude Code and OpenClaw: by default, they operate with the full privileges of your user account. They can read your SSH keys, execute arbitrary shell commands, modify system files, and make network requests to any domain on the internet.

That's not a theoretical risk. It's the default behavior.

Claude Code is Anthropic's agentic coding tool that lives in your terminal. It reads your codebase, writes files, runs commands, and manages git workflows through natural language. OpenClaw takes this a step further - it's an open-source framework that wraps Claude (and other LLMs) into a persistent AI assistant accessible through WhatsApp, Telegram, Discord, and other messaging platforms. Where Claude Code is a terminal session, OpenClaw is a 24/7 agent with browser control, file system access, and shell execution capabilities.

Both tools are genuinely useful. Both tools are genuinely dangerous if you run them without thinking about security boundaries.

This guide covers the practical steps to lock them down: OS-level sandboxing, network isolation with Tailscale, permission configurations, and the specific settings that matter.

What OpenClaw Actually Is

OpenClaw started as "Clawdbot," then became "Moltbot," and has settled into its current name as an open-source personal AI assistant. The project is maintained by Peter Steinberger and a growing community on GitHub.

At its core, OpenClaw runs a local gateway process - a WebSocket control plane at ws://127.0.0.1:18789 - that coordinates between your AI model (typically Claude), your messaging apps, and your local system. You install it globally via npm:

npm i -g openclaw
openclaw onboard --install-daemon

The onboarding wizard walks you through connecting messaging channels, selecting your AI provider, and configuring system access. Once running, you can text your assistant on WhatsApp to push code, manage emails, browse websites, or run shell commands on your machine.

That last part is what should make you pause.

The Access Problem

By default, OpenClaw operates with full host access for direct personal use. The project documentation describes "DM pairing" as the security boundary for messaging platforms - unknown senders get a pairing code before the bot processes their messages. But once paired, the agent can:

  • Execute shell commands with your user privileges
  • Read and write files anywhere your user account can
  • Control a Chrome browser instance
  • Access your file system, calendar, and email
  • Make outbound network requests to any destination

Claude Code has a similar profile. When you run claude in your terminal, it inherits your shell environment. It can read ~/.ssh/, access ~/.aws/credentials, modify your ~/.bashrc, and execute any command you could type manually.
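The inheritance point is easy to demonstrate: every child process spawned from your shell sees the parent's environment, which is exactly how an agent's subprocesses end up seeing your secrets. A minimal Python sketch (the variable name is invented for illustration):

```python
import os
import subprocess
import sys

# Simulate a secret sitting in the parent environment, as API keys often do.
os.environ["FAKE_API_TOKEN"] = "hunter2"

# Any subprocess the agent spawns inherits that environment by default.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['FAKE_API_TOKEN'])"],
    capture_output=True,
    text=True,
)
print(child.stdout.strip())  # the child read the parent's secret: hunter2
```

Nothing here is specific to AI tooling; it's just how process inheritance works, and it's why "the agent runs as my user" is equivalent to "the agent has everything my user has."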

The security model for both tools amounts to: "trust the AI not to do bad things." That's not a security model. That's wishful thinking.

The Threat Model You Should Actually Worry About

Before configuring anything, understand what you're defending against. The primary threats aren't malicious AI - they are:

Prompt injection through untrusted content. When Claude Code reads a file from a cloned repository, that file could contain instructions designed to manipulate the agent's behavior. A malicious README.md or build script could instruct Claude to exfiltrate your credentials. This isn't hypothetical - prompt injection attacks against coding agents are well-documented.

Supply chain poisoning. Running npm install inside a sandbox that has network access means every dependency in your project could execute arbitrary code. If a compromised package runs during installation and the agent has full network access, your SSH keys could be sent to an attacker's server before anyone notices.

Credential exposure. AI agents need to read code to be useful. But they don't need to read .env files, AWS credentials, or SSH private keys. Without explicit deny rules, nothing prevents the agent from reading these files and including their contents in API calls to the model provider.

Unintended side effects. Even without malicious intent, an agent that can run rm -rf or git push --force can cause serious damage through misunderstanding a prompt. The blast radius of a mistake scales with the agent's permissions.

Real CVEs: These Risks Are Documented

Two high-severity CVEs hit Claude Code in 2025, both discovered by security researcher Elad Beber:

CVE-2025-54794 (CVSS 7.7) - Path Restriction Bypass. Claude Code validated file paths by checking if a path started with an allowed directory prefix. An attacker could create a directory like /tmp/allowed_dir_malicious that matched the prefix check, tricking Claude into reading secrets outside the intended boundary. Fixed in version 0.2.111 with canonical path comparison.

CVE-2025-54795 (CVSS 8.7) - Command Injection. A parsing error in Claude Code's confirmation system allowed bypassing the approval prompt to execute untrusted commands. The displayed command did not match the executed command. Fixed in version 1.0.20 with improved input sanitization.

In January 2026, Flatt Security demonstrated 8 distinct methods to bypass Claude Code's command blocklist (CVE-2025-66032). The attacks exploited allowlisted "safe" commands like man, sort, sed, history, and git:

  • man --html="touch /tmp/pwned" man - HTML rendering option executed arbitrary code
  • sort -S 1b --compress-program "sh" - compression program option invoked a shell
  • history -s "malicious"; history -a ~/.bashrc - wrote to shell config files
  • echo test | sed 's/test/command/e' - sed's e flag executed substitution results
  • Bash ${var@P} prompt expansion chains executed command substitutions

Anthropic responded by switching from a blocklist to an allowlist approach in v1.0.93. The lesson: blocklists are fundamentally brittle for command security.
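The structural problem is easy to see in miniature. A blocklist filters known-bad command names, so anything surprising sails through; an allowlist passes only invocations you explicitly enumerated. The lists below are toy examples, not Claude Code's actual rules:

```python
BLOCKED_BINARIES = {"rm", "curl", "wget", "nc"}

ALLOWED_INVOCATIONS = {
    "git status",
    "git diff",
    "npm test",
}

def blocklist_permits(command: str) -> bool:
    # Permits anything whose binary isn't listed - including `man`, `sort`,
    # and `sed` invocations that can execute arbitrary code via their flags.
    return command.split()[0] not in BLOCKED_BINARIES

def allowlist_permits(command: str) -> bool:
    # Permits only exact, pre-approved invocations.
    return command in ALLOWED_INVOCATIONS

bypass = 'man --html="touch /tmp/pwned" man'
print(blocklist_permits(bypass))  # True  - the CVE-2025-66032 class of bypass
print(allowlist_permits(bypass))  # False
```

A blocklist has to anticipate every dangerous flag of every permitted binary; an allowlist only has to enumerate what you actually need.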

The --dangerously-skip-permissions Flag

Claude Code's most dangerous mode is --dangerously-skip-permissions, which bypasses all permission prompts for unattended execution. The risks:

  • File deletions and modifications happen without confirmation
  • Network requests fire without approval
  • A single misinterpreted instruction can cascade into system damage

If you must use it (CI/CD pipelines are the valid use case), isolation is mandatory: run inside a Docker container with --network none and --cap-drop ALL. Never on a developer workstation.

Step 1: Enable OS-Level Sandboxing in Claude Code

Claude Code ships with a built-in sandbox that uses operating system primitives for enforcement. On macOS, it uses Apple's Seatbelt framework. On Linux and WSL2, it uses bubblewrap. This isn't application-level filtering - it's kernel-level isolation that applies to every child process the agent spawns.

macOS Setup

Sandboxing works out of the box on macOS. Enable it by running:

claude
> /sandbox

Select "Auto-allow mode" from the menu. This lets sandboxed commands run without permission prompts while enforcing filesystem and network restrictions. Commands that try to escape the sandbox fall back to the regular permission flow.

Linux/WSL2 Setup

Install the dependencies first:

# Ubuntu/Debian
sudo apt-get install bubblewrap socat

# Fedora
sudo dnf install bubblewrap socat

Then enable sandboxing the same way through the /sandbox command in Claude Code.

Configure Sandbox Settings

Add sandbox configuration to your ~/.claude/settings.json:

{
  "sandbox": {
    "enabled": true,
    "autoAllowBashIfSandboxed": true,
    "allowUnsandboxedCommands": false,
    "excludedCommands": ["docker"],
    "network": {
      "allowedDomains": [
        "github.com",
        "*.githubusercontent.com",
        "*.npmjs.org",
        "registry.yarnpkg.com",
        "pypi.org"
      ],
      "allowLocalBinding": true
    }
  }
}

The critical setting here is "allowUnsandboxedCommands": false. By default, Claude Code has an escape hatch where commands that fail inside the sandbox can retry outside it with the dangerouslyDisableSandbox parameter. Setting this to false eliminates that escape hatch entirely. Every command either runs sandboxed or gets routed through the excludedCommands list.

What the Sandbox Actually Enforces

Filesystem: The agent can read and write within the current working directory and its subdirectories. It can't modify files outside that tree. Attempts to write to ~/.bashrc, /usr/local/bin, or any path outside the project directory are blocked at the OS level.

Network: All outbound connections route through a proxy running outside the sandbox. Only domains in your allowedDomains list can be reached. Attempts to connect to unlisted domains are blocked and you get notified.

Process inheritance: These restrictions apply to every subprocess. If Claude Code runs npm install and a package's postinstall script tries to curl to an unauthorized domain, it gets blocked.

Step 2: Lock Down Permissions

Sandboxing handles filesystem and network boundaries for bash commands. Permissions handle everything else - which tools Claude Code can use, which files it can read through its own tools, and which domains it can fetch through WebFetch.

Create or edit .claude/settings.json in your project root:

{
  "permissions": {
    "allow": [
      "Bash(npm run lint)",
      "Bash(npm run test *)",
      "Bash(npm run build)",
      "Bash(git status)",
      "Bash(git diff *)",
      "Bash(git log *)"
    ],
    "deny": [
      "Read(.env)",
      "Read(.env.*)",
      "Read(./secrets/**)",
      "Read(**/.env)",
      "Read(**/.env.*)",
      "Read(~/.ssh/**)",
      "Read(~/.aws/**)",
      "Read(~/.config/gh/**)",
      "Bash(curl *)",
      "Bash(wget *)",
      "Bash(git push *)",
      "Bash(git push)",
      "Bash(rm -rf *)"
    ],
    "ask": [
      "Bash(git commit *)",
      "Bash(git checkout *)",
      "WebFetch"
    ]
  }
}

The deny list is evaluated before allow rules - anything matching a deny pattern is blocked regardless of what the allow list says.

Pay attention to the Read deny rules. Without them, Claude Code can read your environment files through its built-in file reading tool, completely bypassing the filesystem sandbox (which only applies to bash commands). Denying Read(~/.ssh/**) prevents the agent from including your private keys in any context.
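The evaluation order can be modeled in a few lines - this is a simplified matcher to show the deny-first semantics, not Claude Code's actual implementation:

```python
from fnmatch import fnmatch

DENY_READ = [".env", ".env.*", "*/.env", "*/.env.*", "secrets/*"]
ALLOW_READ = ["src/*", "*.md", "package.json"]

def can_read(path: str) -> bool:
    # Deny rules are evaluated first: a single deny match blocks the read,
    # no matter what the allow list says.
    if any(fnmatch(path, pattern) for pattern in DENY_READ):
        return False
    return any(fnmatch(path, pattern) for pattern in ALLOW_READ)

print(can_read("src/index.ts"))  # True
print(can_read(".env"))          # False - deny wins
print(can_read("config/.env"))   # False
```

Note that a path matching both lists is still blocked - which is why putting credential patterns in deny is safe even when broad allow rules exist.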

Enterprise Managed Settings

For teams, place a managed-settings.json file in the system-level configuration directory:

# macOS
/Library/Application Support/ClaudeCode/managed-settings.json

# Linux
/etc/claude-code/managed-settings.json

Managed settings can't be overridden by user or project settings. This is how you enforce security policy across an engineering team:

{
  "permissions": {
    "deny": [
      "Read(**/.env)",
      "Read(**/.env.*)",
      "Read(~/.ssh/**)",
      "Read(~/.aws/**)"
    ]
  },
  "sandbox": {
    "enabled": true,
    "allowUnsandboxedCommands": false
  }
}
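The precedence rule - managed settings win over everything - can be sketched as a merge where the managed layer is applied last. This is a simplified model of the layering (the real loader also unions permission lists rather than overwriting them):

```python
def effective_settings(managed: dict, project: dict, user: dict) -> dict:
    # Later layers override earlier ones; managed is merged last, so a
    # user or project setting can never relax a managed one.
    merged: dict = {}
    for layer in (user, project, managed):
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = {**merged[key], **value}  # shallow-merge nested blocks
            else:
                merged[key] = value
    return merged

user = {"sandbox": {"enabled": False}}  # developer tries to opt out
managed = {"sandbox": {"enabled": True, "allowUnsandboxedCommands": False}}
print(effective_settings(managed, {}, user)["sandbox"]["enabled"])  # True
```

The developer's `"enabled": False` is silently overruled, which is the behavior you want from a fleet-wide policy mechanism.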

Step 3: Network Isolation with Tailscale

Sandboxing restricts which domains the agent can reach. Tailscale adds another layer: controlling how the agent accesses network resources and routing AI API traffic through an auditable gateway.

Why Tailscale

Tailscale creates a WireGuard-based mesh VPN (a "tailnet") where every device gets a stable identity. There's no central server to route traffic through - devices connect directly to each other using encrypted tunnels. The relevance for AI agents is threefold:

  1. Identity-based access - instead of distributing API keys to every developer machine, route AI traffic through a gateway that uses Tailscale identity for authentication
  2. Network segmentation - restrict which network resources the agent can reach based on ACLs
  3. Audit logging - every connection attempt is logged with device and user identity attached

Set Up Tailscale

Install Tailscale on your development machine:

# macOS
brew install tailscale

# Ubuntu/Debian
curl -fsSL https://tailscale.com/install.sh | sh

# Start and authenticate
sudo tailscale up

Tailscale Aperture for AI Gateway

Tailscale's Aperture product (currently in alpha) is purpose-built for managing AI agent traffic. It acts as an API gateway that sits between your coding agents and LLM providers.

For Claude Code, the configuration in ~/.claude/settings.json looks like:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://ai"
  },
  "apiKeyHelper": "echo dummy-key-aperture-handles-auth"
}

With Aperture, you get:

  • No API key distribution - Aperture uses Tailscale identity to authenticate requests. Individual developers never handle raw API keys.
  • Usage tracking - every request is logged with user identity, token counts, model used, and tool invocations.
  • Cost monitoring - track spending per developer, per team, per project.
  • Anomaly detection - unusual patterns (sudden spikes in token usage, requests to unexpected models) surface automatically.

For CI/CD environments, Aperture integrates with GitHub's federated OIDC. A CI agent can join the tailnet with a tag like tag:claude-pr-reviewer without needing a Tailscale key, and all its AI API traffic flows through the auditable gateway.

ACL Configuration for Agent Isolation

Define Tailscale ACLs to restrict what your AI agent machines can access:

{
  "acls": [
    {
      "action": "accept",
      "src": ["tag:dev-workstation"],
      "dst": ["tag:aperture-gateway:443"]
    },
    {
      "action": "accept",
      "src": ["tag:dev-workstation"],
      "dst": ["tag:git-server:22", "tag:git-server:443"]
    }
  ],
  "tagOwners": {
    "tag:dev-workstation": ["group:engineering"],
    "tag:aperture-gateway": ["group:platform-team"],
    "tag:git-server": ["group:platform-team"]
  }
}

This ACL configuration means development workstations can only reach two things: the Aperture AI gateway and the git server. No direct access to production databases, internal APIs, or other infrastructure. If a compromised agent tries to reach a production server, the connection is rejected at the network level.

Step 4: Securing OpenClaw Specifically

OpenClaw has its own security considerations beyond Claude Code because it runs as a persistent daemon with messaging platform integrations.

Enable Sandbox Mode

In your OpenClaw configuration, enable Docker-based sandboxing for non-main sessions. Note that exact configuration keys may vary between OpenClaw versions - check the documentation for your installed version:

agents:
  defaults:
    sandbox:
      mode: "non-main"

This isolates group chat and channel sessions in Docker containers, preventing them from accessing host resources. Direct message sessions still run on the host by default - for maximum security, consider running the entire OpenClaw gateway inside a container.

Restrict Messaging Access

OpenClaw supports allowlists for messaging platforms. Configure these to limit who can interact with your agent:

messaging:
  allowlist:
    - "+1234567890"    # your phone number
  mentionGating: true  # require @mention in group chats

The mentionGating setting prevents the agent from processing every message in a group chat - it only activates when explicitly mentioned. Without this, any message in a connected group could trigger agent actions.

Run OpenClaw in a Container

The strongest isolation approach for OpenClaw is running the entire gateway in a Docker container with limited capabilities:

FROM node:22-slim

RUN npm i -g openclaw

# Create non-root user
RUN useradd -m -s /bin/bash ocuser
USER ocuser
WORKDIR /home/ocuser

# Only mount the specific workspace directory
VOLUME ["/home/ocuser/workspace"]

EXPOSE 18789

CMD ["openclaw", "start"]

Run it with restricted capabilities:

docker run -d \
  --name openclaw \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  -v ~/projects/workspace:/home/ocuser/workspace \
  -p 127.0.0.1:18789:18789 \
  openclaw-sandboxed

The --cap-drop ALL removes all Linux capabilities. The --security-opt no-new-privileges prevents privilege escalation. The volume mount limits filesystem access to a single workspace directory.

Step 5: The Open Source Sandbox Runtime

Anthropic released their sandbox implementation as an open-source package under the Apache 2 license. The @anthropic-ai/sandbox-runtime npm package provides the same OS-level isolation used in Claude Code, available for any Node.js application.

You can use it to sandbox any process - not just Claude Code. For example, sandboxing an MCP server:

npx @anthropic-ai/sandbox-runtime <command-to-sandbox>

The project lives at github.com/anthropic-experimental/sandbox-runtime and provides:

  • Filesystem isolation using Seatbelt (macOS) or bubblewrap (Linux)
  • Network isolation through a proxy mechanism
  • Domain allowlisting for outbound connections
  • Process inheritance so child processes inherit restrictions

If you're building your own AI agent tooling or want to sandbox arbitrary processes in your development environment, this is a solid foundation. It's the same battle-tested code that Claude Code uses internally, extracted into a standalone package.

A Practical Security Checklist

Here's a condensed checklist for teams deploying AI coding agents:

Sandbox Configuration

  • [ ] Enable OS-level sandboxing (sandbox.enabled: true)
  • [ ] Disable the unsandboxed escape hatch (allowUnsandboxedCommands: false)
  • [ ] Allowlist only necessary network domains
  • [ ] Exclude incompatible commands explicitly rather than disabling the sandbox

Permission Rules

  • [ ] Deny Read access to all credential files (.env, .ssh, .aws)
  • [ ] Deny destructive bash commands (rm -rf, git push --force)
  • [ ] Put mutating git operations in the ask tier
  • [ ] Use managed settings for team-wide policy enforcement

Network Security

  • [ ] Route AI API traffic through Tailscale Aperture or equivalent gateway
  • [ ] Configure Tailscale ACLs to limit agent network reach
  • [ ] Log all AI API requests with user/device identity
  • [ ] Monitor for unusual token usage patterns

OpenClaw Specific

  • [ ] Enable sandbox mode for non-main sessions
  • [ ] Configure messaging allowlists
  • [ ] Enable mention gating for group chats
  • [ ] Consider running the gateway in a Docker container

Operational Hygiene

  • [ ] Review sandbox violation logs weekly
  • [ ] Audit MCP server configurations monthly
  • [ ] Keep Claude Code and OpenClaw updated for security patches
  • [ ] Test security configurations in a clean environment before deploying
  • [ ] Set transcript retention to 7-14 days for sensitive environments

How Fraktional Secures Its AI Development

At Fraktional, we run Claude Code as a core part of our development workflow across a T3 Turbo monorepo with Next.js 16, tRPC, Drizzle ORM, and 12+ integration plugins. Here is how we configure it:

Sandboxing is always on. Every developer has sandbox.enabled: true in their user settings. A managed-settings.json deployed via MDM enforces this so individual developers cannot disable it.

Credential files are denied at the managed level. Our managed settings deny Read access to .env, .env.*, ~/.ssh/**, and ~/.aws/**. These rules cannot be overridden by project or user settings.

Tailscale for all remote work. Remote development VMs are accessible only via Tailscale SSH. No public-facing ports. API traffic routes through a centralized gateway for audit logging.

CI/CD runs in locked containers. GitHub Actions workflows that use Claude Code run in containers with --network none and --cap-drop ALL. Only the repository directory is mounted. No secrets are accessible inside the container.

The principle: treat the AI agent the same way you would treat an untrusted contractor. Give it access to the codebase, not the infrastructure. Let it write code, not deploy it.

The "Treat It Like an Untrusted Intern" Principle

The best mental model for AI coding agents comes from the Backslash Security team: treat the agent like an untrusted but capable intern.

An intern gets access to the codebase but not production credentials. They can run tests but not deploy to production. They can commit code but someone reviews it before it merges. They work in a shared office but don't have keys to the server room.

That's what your security configuration should enforce. Not because the AI is malicious, but because any system with broad access is one prompt injection away from becoming a liability.

The tools exist to do this properly. Claude Code's native sandboxing, Tailscale's network isolation, and explicit permission configurations give you defense in depth. OpenClaw's containerized mode and messaging allowlists add the additional controls needed for a persistent agent.

Use them. Your SSH keys will thank you.
