Kristóf Tóth 25f0037aab Filter environment variables in both sandbox modes

Whitelist mode now clears the parent env and re-adds a small allowlist
(identity, terminal, locale, proxy, non-GUI XDG, vendor prefixes).
Blacklist mode strips cloud credentials, backup passphrases, dangling
socket pointers, and anything matching *_TOKEN, *_SECRET, *_PASSWORD,
*_PASSPHRASE, *_API_KEY, *_PRIVATE_KEY, *_CLIENT_SECRET; vendor prefix
carve-outs keep ANTHROPIC_API_KEY and friends.

Users can override via --setenv KEY=VALUE and --unsetenv KEY (and the
corresponding TOML keys), or opt out of the built-in policy entirely
with --no-env-filter.

2026-04-08 09:22:11 +02:00


agent-sandbox

Sandbox agentic coding assistants with bubblewrap. Limits what an AI agent can see and modify on the host, reducing the blast radius of prompt injection and accidental damage.

Modes

Whitelist

Tight sandbox for normal agent coding tasks. Only explicitly listed paths are visible: system binaries, libraries, a subset of /etc, and /sys (all read-only); a synthetic /dev; a private /proc; /tmp; /run; and the working directory (read-write). Everything else is invisible.
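
As a rough illustration of this layout, the sketch below assembles a bubblewrap argument list in Python. It is not agent-sandbox's actual invocation: the function name `whitelist_bwrap_args`, the `system_paths` and `etc_subset` parameters, the use of a PID namespace, and the choice of fresh tmpfs mounts for /tmp and /run are all assumptions made for the example. Only the bwrap options themselves (--ro-bind, --bind, --dev, --proc, --tmpfs, --unshare-pid) are real.

```python
import os
from typing import Iterable, List


def whitelist_bwrap_args(
    workdir: str,
    system_paths: Iterable[str] = ("/usr",),  # assumption: where binaries and libraries live
    etc_subset: Iterable[str] = (),           # assumption: caller supplies the /etc entries to expose
) -> List[str]:
    """Build a hypothetical bwrap argv that exposes only the listed paths."""
    args = ["bwrap"]
    # Read-only views of system paths, selected /etc entries, and /sys.
    for path in [*system_paths, *etc_subset, "/sys"]:
        args += ["--ro-bind", path, path]
    # Synthetic /dev, plus a new /proc inside its own PID namespace (assumption).
    args += ["--dev", "/dev", "--unshare-pid", "--proc", "/proc"]
    # Assumption: /tmp and /run are fresh tmpfs mounts rather than host binds.
    args += ["--tmpfs", "/tmp", "--tmpfs", "/run"]
    # The working directory is the only host path mounted read-write here.
    workdir = os.path.abspath(workdir)
    args += ["--bind", workdir, workdir]
    return args
```

Any path that never appears in the argument list has no mount inside the sandbox, which is what makes "everything else" invisible.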

Blacklist

Looser sandbox for system-level debugging with agent assistance. The host filesystem is mounted read-only, with targeted overlays hiding sensitive paths (credentials, history, secrets, sockets, input devices). /run and ${XDG_RUNTIME_DIR} are replaced with tmpfs mounts that only expose the paths needed for system tooling (systemctl, resolvectl, journalctl, etc.).

The threat model is prompt injection and accidental damage, not a determined attacker with user-level access.

Not protected in blacklist mode: arbitrary readable files outside the sensitive paths list, and D-Bus method calls (access control is daemon-side).

Environment filtering

Both modes clamp the environment the child sees so prompt-injected agents can't printenv their way to secrets.

Whitelist clears the parent env and re-adds a small allowlist: identity/shell vars (HOME, PATH, …), terminal/locale, proxy, non-GUI XDG base dirs, and agent vendor prefixes (ANTHROPIC_*, CLAUDE_*, OPENAI_*, CODEX_*, GEMINI_*, OTEL_*).
Blacklist keeps the parent env but unsets credentials and dangling pointers: cloud creds (AWS_*, GOOGLE_APPLICATION_CREDENTIALS, …), backup tool passphrases, sockets stripped by path overlays (SSH_AUTH_SOCK, DISPLAY, GNUPGHOME, …), and anything matching *_TOKEN, *_SECRET, *_PASSWORD, *_PASSPHRASE, *_API_KEY, *_PRIVATE_KEY, *_CLIENT_SECRET. Vendor-prefix vars (ANTHROPIC_API_KEY etc.) are carved out so they survive.
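
The two policies can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the function names are hypothetical, the name lists are deliberately partial (the README elides the full sets), and glob matching via `fnmatchcase` is an assumption about how the `*_TOKEN`-style patterns are evaluated.

```python
from fnmatch import fnmatchcase
from typing import Dict

# Vendor prefixes named in the README.
VENDOR_PREFIXES = ("ANTHROPIC_", "CLAUDE_", "OPENAI_", "CODEX_", "GEMINI_", "OTEL_")

# Partial, illustrative lists only; the real sets are longer.
ALLOWED_NAMES = {"HOME", "PATH"}
DENIED_NAMES = {"GOOGLE_APPLICATION_CREDENTIALS", "SSH_AUTH_SOCK", "DISPLAY", "GNUPGHOME"}
DENIED_PATTERNS = (
    "AWS_*", "*_TOKEN", "*_SECRET", "*_PASSWORD", "*_PASSPHRASE",
    "*_API_KEY", "*_PRIVATE_KEY", "*_CLIENT_SECRET",
)


def is_vendor(name: str) -> bool:
    """True for variables under an agent vendor prefix."""
    return name.startswith(VENDOR_PREFIXES)


def filter_whitelist(env: Dict[str, str]) -> Dict[str, str]:
    """Start from nothing and keep only allowlisted or vendor-prefixed names."""
    return {k: v for k, v in env.items() if k in ALLOWED_NAMES or is_vendor(k)}


def filter_blacklist(env: Dict[str, str]) -> Dict[str, str]:
    """Keep everything except denied names and patterns; vendor vars survive."""
    def denied(name: str) -> bool:
        if is_vendor(name):  # carve-out is checked before any deny rule
            return False
        return name in DENIED_NAMES or any(fnmatchcase(name, p) for p in DENIED_PATTERNS)

    return {k: v for k, v in env.items() if not denied(k)}
```

Checking the vendor carve-out first is what lets ANTHROPIC_API_KEY pass even though it matches `*_API_KEY`.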

Disable the built-in policy entirely with --no-env-filter (or env-filter = false in the config file) to pass the parent env through unchanged. User --setenv/--unsetenv escape hatches still apply.

Seccomp

Both modes apply a seccomp-BPF syscall allowlist derived from Podman's default profile. Dangerous syscalls (mount, unshare, ptrace, bpf, perf_event_open, io_uring_*, keyctl, kexec_*, …) fail with ENOSYS. Disable the filter with --no-seccomp or seccomp = false in the config file.

Configuration file

Settings can be stored in a TOML config file at $XDG_CONFIG_HOME/agent-sandbox/config.toml (or pass --config <path>). Use --no-config to skip loading it. The config file accepts the same options as the corresponding CLI flags.

Top-level keys set defaults; [profile.<name>] sections define named presets selectable with --profile <name>. CLI flags always take highest precedence, followed by the active profile, then top-level defaults.

# Global defaults
whitelist = true
unshare-net = true
ro = ["~/.aws"]

# Named profile
[profile.docker]
blacklist = true
rw = ["/var/run/docker.sock"]
command = ["claude", "--dangerously-skip-permissions"]
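
The precedence rule (CLI flags, then the active profile, then top-level defaults) amounts to a layered merge. The Python sketch below applies it to a dict mirroring the example config. The function name `resolve_settings` is hypothetical, and two behaviours are assumptions: a profile's list value replaces the top-level list rather than extending it, and the merge is purely key by key, so it does not model how the tool reconciles whitelist = true with a profile's blacklist = true.

```python
from typing import Any, Dict, Optional


def resolve_settings(
    config: Dict[str, Any],
    profile: Optional[str],
    cli_flags: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge settings from lowest to highest precedence."""
    # 1. Top-level keys are the defaults (the "profile" table is not a setting).
    settings = {k: v for k, v in config.items() if k != "profile"}
    # 2. The active profile overrides the defaults.
    if profile is not None:
        settings.update(config.get("profile", {})[profile])
    # 3. CLI flags override everything else.
    settings.update(cli_flags)
    return settings


# The example config from above, as it would look after TOML parsing.
config = {
    "whitelist": True,
    "unshare-net": True,
    "ro": ["~/.aws"],
    "profile": {
        "docker": {
            "blacklist": True,
            "rw": ["/var/run/docker.sock"],
            "command": ["claude", "--dangerously-skip-permissions"],
        }
    },
}

resolved = resolve_settings(config, "docker", {"unshare-net": False})
```

Here `resolved["unshare-net"]` is False because the CLI value overrides the top-level default, while `ro` is inherited from the defaults and `rw` comes from the profile.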

Escape hatches

When the agent needs access to something the sandbox blocks, use --rw or --ro for paths and --setenv/--unsetenv for env vars. User overrides always win over the built-in policies.

agent-sandbox --rw /var/run/docker.sock -- claude --dangerously-skip-permissions
agent-sandbox --ro ~/.aws -- claude --dangerously-skip-permissions
agent-sandbox --setenv DATABASE_URL=postgres://localhost/dev -- claude
agent-sandbox --unsetenv HTTP_PROXY -- claude
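
For env vars, "user overrides always win" can be pictured as applying --unsetenv and --setenv after the built-in policy has run. The Python sketch below is illustrative only: `build_child_env` is a hypothetical name, passing `policy=None` stands in for --no-env-filter, and the order used when the same key is both set and unset (unset first, then set) is an assumption the README does not specify.

```python
from typing import Callable, Dict, Iterable, Optional

Env = Dict[str, str]


def build_child_env(
    parent: Env,
    policy: Optional[Callable[[Env], Env]],
    setenv: Optional[Env] = None,
    unsetenv: Iterable[str] = (),
) -> Env:
    """Apply the built-in policy first, then the user's overrides."""
    # policy=None models --no-env-filter: the parent env passes through unchanged.
    env = dict(parent) if policy is None else dict(policy(parent))
    # User removals apply even to variables the policy would have kept.
    for key in unsetenv:
        env.pop(key, None)
    # User additions apply even to variables the policy would have dropped.
    env.update(setenv or {})
    return env


# Example: a policy that drops everything still yields the user's variable.
child = build_child_env(
    {"HTTP_PROXY": "http://proxy:3128", "HOME": "/home/u"},
    policy=lambda env: {},
    setenv={"DATABASE_URL": "postgres://localhost/dev"},
)
```

In this example `child` contains only DATABASE_URL, since the stand-in policy removed every inherited variable and the override was applied afterwards.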