Apply a seccomp-BPF syscall allowlist by default

Derived from Podman's default profile, stripped of capability-conditional
rules (we never grant capabilities), argument filters, and the explicit
EPERM block. Dangerous syscalls (mount, unshare, ptrace, bpf,
perf_event_open, io_uring_*, keyctl, kexec_*, ...) fall through to the
default ENOSYS action, which also keeps glibc's clone3 -> clone fallback
working. x86_64 and aarch64 are supported; other archs error out.

Toggle with --seccomp / --no-seccomp or seccomp = <bool> in config.
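
The allowlist behaviour described above can be modelled in plain Rust as a rough sketch. This is not the BPF program from this commit: `Action`, `check_arch`, `decide`, and `sample_allowlist` are hypothetical names, and the three-entry allowlist is illustrative only. It assumes clone3 is absent from the allowlist, as the message implies, so it receives the default ENOSYS action rather than EPERM.

```rust
use std::collections::HashSet;

/// What the filter does with a syscall (a simplified model, not real BPF).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Action {
    Allow,
    /// Fail the call with the given errno.
    Errno(i32),
}

/// ENOSYS ("function not implemented") on Linux x86_64 and aarch64.
const ENOSYS: i32 = 38;

/// Accept only the architectures the commit message lists as supported.
fn check_arch(arch: &str) -> Result<(), String> {
    match arch {
        "x86_64" | "aarch64" => Ok(()),
        other => Err(format!("seccomp: unsupported architecture: {other}")),
    }
}

/// Hypothetical allowlist lookup: anything not listed gets the default action.
fn decide(allowlist: &HashSet<&str>, syscall: &str) -> Action {
    if allowlist.contains(syscall) {
        Action::Allow
    } else {
        // Default action: report the syscall as unimplemented.
        Action::Errno(ENOSYS)
    }
}

/// Illustrative allowlist; the real profile is assumed to be far larger.
fn sample_allowlist() -> HashSet<&'static str> {
    ["read", "write", "clone"].iter().copied().collect()
}
```

In this model `decide(&sample_allowlist(), "clone3")` yields `Action::Errno(ENOSYS)`. A caller that treats ENOSYS as "kernel too old" can then retry with clone, which is allowed. An EPERM default would look like a hard permission failure and would not trigger that retry.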
2026-04-08 08:34:34 +02:00
parent 5f3b139457
commit 12644ae31e
11 changed files with 772 additions and 0 deletions

@@ -34,6 +34,14 @@ pub struct Args {
    #[arg(long, overrides_with = "unshare_net")]
    pub share_net: bool,

    /// Enable seccomp syscall filtering (on by default; overrides config-file `seccomp = false`)
    #[arg(long, overrides_with = "no_seccomp")]
    pub seccomp: bool,

    /// Disable seccomp syscall filtering (overrides config-file `seccomp = true`)
    #[arg(long, overrides_with = "seccomp")]
    pub no_seccomp: bool,

    /// Bind an extra path read-write (repeatable)
    #[arg(long = "rw", value_name = "PATH", action = clap::ArgAction::Append)]
    pub extra_rw: Vec<PathBuf>,
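
A minimal sketch of how the two flags might be combined with the config value, based only on the doc comments in the hunk above. `resolve_seccomp` is a hypothetical helper, not necessarily how this commit resolves the setting. It takes the config-file value as an `Option<bool>` and treats a missing value as enabled, matching "on by default".

```rust
/// Hypothetical helper: compute the effective seccomp setting.
///
/// `seccomp` and `no_seccomp` mirror the two CLI flags; because each is
/// declared with `overrides_with` pointing at the other, clap leaves at most
/// one of them set. `config` is the optional `seccomp = <bool>` config value.
fn resolve_seccomp(seccomp: bool, no_seccomp: bool, config: Option<bool>) -> bool {
    if no_seccomp {
        // An explicit --no-seccomp beats any config-file value.
        false
    } else if seccomp {
        // An explicit --seccomp beats `seccomp = false` in the config.
        true
    } else {
        // No flag given: use the config value, defaulting to enabled.
        config.unwrap_or(true)
    }
}
```

Keeping the flags as two plain booleans means "no flag given" stays distinguishable from an explicit choice, which is what lets the config-file value apply only when the command line is silent.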