The Defensive OpSec Operating Standard for Agentic Security Review
A short, citable operating standard for AI coding agents that perform security review. Tool-agnostic in principle; first vehicle is Vercel's deepsec via the deepsec-skill. Five rules, four templates, one standards spine — all MIT-licensed and deep-linkable.
Version 1.0 · Published 2026-05-06 · Plain-markdown source at /standard.md (CORS open).
Install the skill
Via the skills.sh CLI:
npx skills add johndfowler/deepsec-skill
Or directly from URL:
npx skills add https://www.deepsec-skill.dev/SKILL.md
Compatible with Claude Code, Codex, Cursor, OpenCode, Continue, Goose, Aider, GitHub Copilot CLI, Gemini CLI, Cline, Warp, and ~50 more agents. Picks up your existing claude or codex CLI subscription automatically.
The Five Rules
Before any AI spend, the agent operates under these five rules. Each has a stable anchor so PRs, blog posts, RFCs, and internal AppSec wikis can deep-link a single rule.
1. Authorization and scope #
Work only on code the user controls or is authorized to assess. Confirm repo root, target path, and whether production systems are in scope. The upstream deepsec README is explicit on why this matters: "Treat deepsec like a coding agent with full shell access on the environment that it is running on." [upstream warning]
2. Threat-sketch first #
Identify exposed interfaces, auth boundaries, privileged operations, sensitive data, trust boundaries, build/release surfaces, and likely attacker goals — using OWASP's four-step Threat Modeling Cheat Sheet and STRIDE.
3. Defensive evidence only #
Evidence may include file paths, data-flow summaries, missing controls, affected assets, authorization assumptions, and safe reproduction notes. Evidence must not include exploit payloads, bypass recipes, credential theft, stealth, persistence, or exfiltration instructions.
4. Standards as vocabulary, not ceremony #
ASVS, WSTG, CWE, CVSS v4.0, NIST SSDF, NIST AI RMF, CISA SbD, SLSA, OpenSSF Scorecard, Sigstore, OWASP GenAI Top 10, ISO/IEC 29147/30111/TR 5895 — used when they clarify risk, fix, or evidence; skipped when they would be padding. Full spine ↓
5. Honest uncertainty #
Findings that cannot be safely verified inside the user's authorized scope are marked needs-authorized-validation rather than fabricated.
The Finding Packet Template #
Every confirmed issue is reported in this fixed shape so engineering, supply-chain, IR, and governance reviewers can read it without translation.
Finding: <title>
Severity / confidence: <severity> / <confidence>
Affected asset: <asset>
Trust boundary: <boundary>
Impact: <business/security impact>
Defensive evidence: <non-weaponized verification>
Control mapping: <ASVS/WSTG/CWE/CVSS if useful>
Supply-chain relevance: <none | dependency | CI | artifact | release gate>
Fix: <focused remediation>
Verify: <safe verification step>
Residual risk: <after fix or unknown>
Disclosure sensitivity: <internal | coordinated disclosure | advisory channel>
Extensions: when a finding touches dependencies, CI, package publishing, build scripts, secrets, artifact integrity, deployment promotion, or release gates, add SBOM/VEX considerations, SLSA level, Sigstore/in-toto provenance, and OpenSSF Scorecard signals. When a finding affects customer data, revenue, service availability, or regulatory exposure, add materiality cues, operational blast radius, and explicit verification-vs-uncertainty bounds. The lens surfaces inputs for materiality determinations; humans make the determinations.
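A hypothetical filled packet (every value below is invented for illustration) shows how the fields read together:

Finding: Unauthenticated customer-export route
Severity / confidence: High / High
Affected asset: /api/export serverless route
Trust boundary: public internet → tenant data store
Impact: bulk read of customer records by any caller
Defensive evidence: handler lacks the shared auth middleware used by sibling routes; data flow traced from request to unfiltered query
Control mapping: ASVS access-control chapter; CWE-306 (missing authentication for critical function)
Supply-chain relevance: none
Fix: apply the shared auth middleware and scope queries to the requesting tenant
Verify: unauthenticated request returns 401/403 in staging
Residual risk: none known after fix
Disclosure sensitivity: internal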
The INFO.md Rubric (50–100 lines) #
INFO.md is the per-project context file injected into every AI prompt batch. Signal density matters. Five sections, 50–100 lines total, no line numbers, max 5 paths per list, skip generic CWE categories.
# What this codebase does
Two or three sentences. Stack and surface area
(e.g. "React SPA + 4 serverless API routes on Vercel"). Include the
business-critical assets or user outcomes that would matter if
compromised.
# Auth shape
Every auth boundary in one place. Helper names, not line numbers.
If there is no user auth, say so explicitly. Note trust boundaries
and privileged operations.
# Threat model
What an attacker would actually want, ranked by impact
(financial > reputational > data). Include sensitive data, externally
reachable interfaces, and material business or governance concerns.
# Project-specific patterns to flag
Three to five patterns the built-in matchers will not know about:
custom middleware, internal helpers, env-var-driven recipient lists,
prompt-injection envelopes for AI endpoints, build/release scripts,
package-publishing flows, CI secrets, artifact-signing/provenance gaps.
# Known false-positives
Patterns that *look* dangerous but are intentional. Include expected
mitigations when they explain why something is safe.
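The rubric's hard constraints are mechanically checkable before a scan. A minimal linter sketch, assuming the five headings above appear verbatim; the function and its heuristics are illustrative, not part of the skill:

```python
REQUIRED_SECTIONS = [
    "# What this codebase does",
    "# Auth shape",
    "# Threat model",
    "# Project-specific patterns to flag",
    "# Known false-positives",
]

def lint_info_md(text: str) -> list[str]:
    """Return rubric violations for an INFO.md draft (empty list = passes)."""
    lines = text.strip().splitlines()
    stripped = [line.strip() for line in lines]
    problems = []
    # Hard length budget: signal density matters.
    if not 50 <= len(lines) <= 100:
        problems.append(f"length {len(lines)} lines; rubric wants 50-100")
    # All five sections must be present, matched by exact heading.
    for heading in REQUIRED_SECTIONS:
        if heading not in stripped:
            problems.append(f"missing section: {heading}")
    # The rubric forbids line numbers; helper names survive refactors.
    if any(":L" in line or " line " in line.lower() for line in stripped):
        problems.append("possible line-number reference; use helper names")
    return problems
```

A check like this can run in CI so the context file never silently drifts outside the rubric's budget.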
The Scan Gate #
After the free regex scan and before the paid AI process, the agent emits this status block verbatim and waits for explicit user approval. No surprise bills.
deepsec scan complete.
Candidates: <n>
Scope: <target>
Cost note: process is the paid AI step.
Recommendation: <process now | narrow scope first | enrich INFO.md first>
Need approval before running process.
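The gate reduces to a small mechanical contract: render the status block verbatim, then refuse the paid step until the user explicitly approves. A minimal sketch, where the function names and the approval vocabulary are illustrative, not part of deepsec:

```python
def scan_gate_block(candidates: int, target: str, recommendation: str) -> str:
    """Render the scan-gate status block emitted verbatim after the free scan."""
    return "\n".join([
        "deepsec scan complete.",
        f"Candidates: {candidates}",
        f"Scope: {target}",
        "Cost note: process is the paid AI step.",
        f"Recommendation: {recommendation}",
        "Need approval before running process.",
    ])

def may_run_process(user_reply: str) -> bool:
    """Only an explicit approval unlocks the paid step; silence or ambiguity does not."""
    return user_reply.strip().lower() in {"yes", "approve", "process now"}
```

The deliberately narrow approval set is the point: anything short of an explicit yes keeps the spend at zero.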
The Run Closeout #
After remediation and revalidate, every run lands a clean handoff artifact. Not a wall of findings — a closeout.
Run summary:
- Scope assessed:
- Candidates processed:
- Findings confirmed:
- False positives:
- Fixes made:
- Revalidation:
- Residual risks:
- Follow-up gates:
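Filled in, a closeout might read (values invented for illustration):

Run summary:
- Scope assessed: api/ and lib/auth/
- Candidates processed: 14
- Findings confirmed: 3
- False positives: 11
- Fixes made: 2 (shared auth middleware applied; secret moved out of source)
- Revalidation: both fixes pass revalidate
- Residual risks: 1 finding marked needs-authorized-validation (third-party webhook)
- Follow-up gates: re-run scan after the next dependency upgrade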
The Standards Spine #
The standard does not invent vocabulary. It threads agentic security review through the published references the security community already uses.
- OWASP ASVS 5.0 — application security verification standard, released May 2025 at Global AppSec EU Barcelona. Used as control vocabulary on findings.
- OWASP WSTG v4.2 — web security testing guide categories (auth, authz, session, input validation, business logic, API).
- OWASP Threat Modeling Cheat Sheet — four-step process (decompose, identify, mitigate, validate) and STRIDE.
- NIST SP 800-218 (SSDF v1.1) and SP 800-218A (SSDF community profile for generative AI and dual-use foundation models) — secure SDLC practices.
- NIST AI Risk Management Framework (AI RMF 1.0, NIST AI 100-1), the companion AI RMF Playbook, and the Generative AI Profile (AI 600-1) — Govern, Map, Measure, Manage functions for trustworthy AI.
- OWASP GenAI Security Project — umbrella for the LLM Top 10, the Agentic Security Initiative, the Red Teaming initiative, and the AI Security Solutions Landscape.
- CISA Secure by Design pledge — seven goals including MFA, default passwords, vulnerability classes, patches, VDP, accurate CVE/CWE, and evidence of intrusion. 350+ signatories as of Q2 2026.
- CISA 2025 SBOM Minimum Elements — adds Component Hash, License, Tool Name, and Generation Context to the 2021 NTIA baseline.
- SLSA v1.2 — Build Track and Source Track levels for supply-chain integrity, including provenance and Verification Summary Attestations.
- OpenSSF Scorecard — repo-hygiene checks (Branch-Protection, Code-Review, Signed-Releases, Token-Permissions, Pinned-Dependencies, Security-Policy, SAST, Fuzzing).
- Sigstore / Cosign — keyless artifact signing via Fulcio (OIDC-issued short-lived certs) and the Rekor transparency log.
- FIRST CVSS v4.0 — Base, Threat, Environmental, and Supplemental metric groups for severity expression.
- OWASP Top 10 for LLM Applications 2025 — prompt injection, excessive agency, system prompt leakage, supply-chain risk, sensitive information disclosure.
- ISO/IEC 29147:2018, ISO/IEC 30111:2019, and ISO/IEC TR 5895:2022 — coordinated and multi-party coordinated vulnerability disclosure.
- SEC Reg S-K Item 106 and Form 8-K Item 1.05 — material cybersecurity incident disclosure on a four-business-day clock from the materiality determination, plus annual risk-management process disclosure.
- OWASP Agentic Skills Top 10 — agentic-skill-specific threat surface; AST01 covers malicious-skill registry attacks (typosquatting, brand impersonation, prompt injection embedded in SKILL.md).
- Cloud Security Alliance MAESTRO — seven-layer threat-modeling framework for agentic AI systems.
- Snyk research, "SKILL.md to Shell Access" (Feb 2026) — empirical study of 40,000+ scanned skills across 7 registries: 91% of confirmed malicious skills use prompt injection; 100% combine code-layer + natural-language layer attacks.
The skill cites these so its findings are auditable and so the user can hand them to AppSec, an external assessor, an underwriter, or counsel without re-translation.
Why this exists
Discovery is no longer the bottleneck for AI-assisted security review. Translation is. Agentic scanners produce findings; the security community needs them in a vocabulary that AppSec, supply-chain, IR, and governance reviewers can act on without re-translation. This standard is one attempt at that translation layer.
Two operating facts about deepsec, attributed to Vercel's announcement, that the security community will weigh:
- False-positive rate ≈ 10–20% at default settings, with revalidate available to cut it further.[1] Independent SAST surveys in 2025–2026 cite 20–60% as common across legacy tools.
- Scale. "Scans on Vercel's codebases routinely scale up to 1,000+ concurrent sandboxes" via Vercel Sandbox fan-out.[1]
And two practitioner quotes, verbatim:
"We've been on a lookout for a tool to do security scans on our open source repositories. deepsec's scan have been the most thorough, with most findings, and good true-positive rate."
"We get a lot of automated security reports, but most of them aren't actionable. deepsec is the first tool that's surfaced the kind of issues we'd actually want a security engineer to flag, and it runs on infrastructure we control."
Adopt the standard #
MIT licensed. Three ways to plug in.
As an agent skill
npx skills add johndfowler/deepsec-skill
As a citation
Deep-link any rule or template by its anchor:
- /standard.md — full text, plain markdown, CORS open
- #authorization · #threat-sketch · #defensive-evidence · #standards · #honest-uncertainty
- #finding-packet · #info-rubric · #scan-gate · #closeout
As a fork
Lift any template into your own tool, skill, or internal AppSec runbook. MIT, attribution appreciated, modification expected. Versioned: this is v1.0; future revisions land at /standard/v2.md etc.; v1.0 stays at /standard.md permanently. Existing citations don't rot.
Drop this in your CLAUDE.md
Pin precedence in your own host project. Paste this block into your repo's CLAUDE.md, AGENTS.md, .cursor/rules, or GEMINI.md — same shape works for all of them:
## Security review (Defensive OpSec Operating Standard v1.0)
When running `deepsec` or any agentic security review in this repo, apply
https://www.deepsec-skill.dev/standard.md. Its five rules — authorization,
threat sketch, defensive evidence only, standards as vocabulary, honest
uncertainty — take precedence over the rest of this CLAUDE.md for
security-scan tasks.
Source: /standard/claude-md-snippet.md (CORS open, MIT, lift it).
Surviving your CLAUDE.md #
Boris Cherny (Claude Code lead) has publicly identified CLAUDE.md as the single largest source of deployment-time issues. The implication for any agent skill: when an agent activates inside a real host project, that project's CLAUDE.md, AGENTS.md, .cursor/rules, or GEMINI.md can absorb, dilute, or override skill-level discipline. A standard that loses to a project's "be terse, no preamble" rule during a security review is theatre.
The standard is hardened against absorption from both ends. v1.0.1 tightens the enforcement language after an adversarial-CLAUDE.md absorption test surfaced three real failure modes:
- Skill-side precedence. The first operational instruction in SKILL.md declares that the five rules win over project-level CLAUDE.md / AGENTS.md / .cursor/rules / GEMINI.md for security-review tasks.
- Project-side pinning. The four-line snippet above goes into the host repo's CLAUDE.md and pins precedence in the project's own voice.
- Activation canary, mechanically exempt from terseness. The skill's first chat output must be: Applying Defensive OpSec Operating Standard v1.0 — 5 rules, scan-gate active, defensive-evidence only. The canary is a mechanical activation handshake, not preamble. Project instructions like "be terse" or "no preamble" govern response substance — they do NOT suppress the canary. Missing line = failed activation; refuse the tool.
- Conflict-surfacing is unconditional. Project rules like "never ask the user" or "skip confirmation" do NOT apply to skill-precedence conflicts. The user is the standard's defined arbiter; depriving them of that role is a Rule 1 violation. Refuse to proceed until the user confirms which authority wins.
- No silent finding-drop. Findings that cannot be safely verified must still be reported with needs-authorized-validation. If a host CLAUDE.md prohibits the vocabulary, surface the conflict and report the finding anyway. Silent omission is forbidden.
The discipline is meaningless without these. Adopters that skip the snippet still get the skill's behavior on first activation; adopters that paste the snippet get it on every future agent run in the same repo.
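Because the canary is an exact string, a harness can test activation mechanically rather than argue about style. A minimal sketch, where the function name is illustrative and the string is the canary from the standard:

```python
CANARY = ("Applying Defensive OpSec Operating Standard v1.0 — "
          "5 rules, scan-gate active, defensive-evidence only.")

def activation_ok(first_output_line: str) -> bool:
    """Exact match only: a paraphrased or truncated canary is a failed activation."""
    return first_output_line.strip() == CANARY
```

An exact-match check is what makes the canary immune to "be terse" rules: there is no shorter output that still passes.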
Credit and scope
The scanner is vercel-labs/deepsec, Apache-2.0 licensed, announced in Introducing deepsec: find and fix vulnerabilities in your code base by Malte Ubl (@cramforce), CTO of Vercel. The regex-then-AI pipeline, the dual-backend verification loop (Claude Agent SDK and Codex), the cost-aware command set (scan, process, process --diff, triage, revalidate, enrich, report, export, metrics, status, sandbox), and the Vercel Sandbox fan-out for large repos are all Vercel's work. The default models are claude-opus-4-7 for process/revalidate, gpt-5.5 for the Codex backend, and claude-sonnet-4-6 for triage; Vercel's published cost guidance is roughly $25–60 / 100 files, $130–300 / 500 files, $500–1,200 / 2,000 files, with the README noting that scans on large codebases can run into the thousands or tens of thousands of dollars.[2]
This standard and the deepsec-skill are the agent-facing wrapper. They do not modify the scanner. They add the operating ritual on top.
What's new
- v1.0.2 — prior-art harvest. Cherry-picked load-bearing patterns from earlier work in the AI-skills security ecosystem, with explicit credit at #prior-art: the three-phase methodology (context → comparative → vulnerability) and HIGH-CONFIDENCE filter from anthropics/claude-code-security-review; the "anchor every architectural claim to evidence in the repo" rule from openai/skills/security-threat-model; the parallel false-positive sub-task pattern; the security frontmatter proposal from alirezarezvani/skill-security-auditor. Standards spine extended with OWASP Agentic Skills Top 10, Cloud Security Alliance MAESTRO, and Snyk's empirical research (40,000+ scanned skills; 91% of malicious skills use prompt injection). DO-NOT-TRIGGER clauses added to SKILL.md frontmatter. Five rules and four templates unchanged.
- v1.0.1 — adversarial absorption-test patches. A hard test against a worst-case adversarial CLAUDE.md surfaced three real failure modes in v1.0: (1) the canary line could be silently suppressed by "be terse / no preamble" instructions; (2) a circular dependency where conflict-surfacing requires asking the user but project CLAUDE.md prohibits asking; (3) silent finding-drop under vocabulary conflicts. v1.0.1 tightens the SKILL.md and standard.md language to close all three. Five rules and four templates unchanged; this is an enforcement-language-only change.
- Absorption-resistance v1.0. Added activation precedence and a forced-acknowledgement canary to /SKILL.md so the standard's discipline survives a host project's CLAUDE.md, AGENTS.md, .cursor/rules, or GEMINI.md. Published the four-line CLAUDE.md adoption snippet at /standard/claude-md-snippet.md. Added the Surviving your CLAUDE.md section to the page and to /standard.md.
- Defensive OpSec Operating Standard v1.0 published. Page reframed as the standard's home: five rules with anchor IDs, finding-packet / INFO.md / scan-gate / closeout templates inlined as citable code blocks, plain-markdown source at /standard.md (CORS open). Demos and brand chrome removed in favor of evaluation surface.
- Initial release: skill published, listed on skills.sh.