# Methodology: Exa-driven reference discipline

A repeatable playbook for building, cross-referencing, and maintaining the citation layer that supports the **Defensive OpSec Operating Standard**, the deepsec agent skill, and every specimen this project publishes.

- Methodology version: 1.0 (2026-05-07)
- Standard applied: <https://www.deepsec-skill.dev/standard.md> v1.1.1
- Recommended search primitive: [Exa](https://exa.ai) MCP (`web_search_exa`, `web_fetch_exa`)
- Equivalents accepted: Brave Search, Tavily, Perplexity, direct WebFetch chains, or any web-search MCP that supports keyword + neural retrieval and returns canonical URLs
- Output schema: `references.json` at the project root

This document is the **canon** for how citations are produced and verified across the toolkit. Every claim that appears in `standard.md`, `deepsec/SKILL.md`, or any `specimens/*.md` traces back to an entry in `references.json` whose `verified_via` / `verified_on` fields were set per the rules below.

---

## Why a methodology

Three failure modes drive the discipline:

1. **Silent rot:** markdown links 404 over time. Without a `verified_on` cadence, half a year later the standards spine looks authoritative but half the URLs no longer resolve.
2. **One-source bias:** a single article repeated in 10 specialist outlets is still one source. Without independent triangulation, "five citations" can collapse to one wire-service feed.
3. **Tool-locked claims:** when the skill cites a NIST or ISO document at run time, it should be able to verify the URL, not assert it from training-data memory. Web-search MCPs make that cheap.

The methodology is opinionated about *how* to cite, agnostic about *which tool*. Exa is the default because it's what produced the existing corpus, but Rule 4 of the standard (standards as vocabulary, not ceremony) extends to tooling: the discipline is non-negotiable, the brand is not.

---

## The five-tier source classification (canon)

Every entry in `references.json` carries a `tier` from 1 to 5. The tiers were forged in the stablecoin specimen and are now canon:

| Tier | Definition | Examples |
|---|---|---|
| **1** | Primary regulator, legislator, official register, or central-banker speech on an official channel | congress.gov, hkma.gov.hk, mas.gov.sg, bis.org/review, fca.org.uk, sec.gov |
| **2** | Issuer official disclosure (S-1, transparency page, attestation PDF, press release on the issuer's own domain) | circle.com/transparency, tether.io/news, GitHub README, Apache-licensed source |
| **3** | Standards body or IGO working paper (NIST, OWASP, FSB, IMF, BIS, S&P stability assessment) | csrc.nist.gov, owasp.org, imf.org/publications, bis.org/publ |
| **4** | Major journalism or legal commentary (Bloomberg, Reuters, FT, Coindesk, Mayer Brown, Debevoise) | bloomberg.com, coindesk.com, mayerbrown.com |
| **5** | Specialist or regional press (StablecoinLaws, blockeden.xyz, regional outlets) | stablecoinlaws.org, koreatimes.co.kr, soberano.news |

A claim is **load-bearing** if removing it changes the conclusion of a finding, threat sketch, or run closeout. Load-bearing claims must triangulate to ≥ 2 independent Tier-1/2/3 sources. Tier 4 / 5 sources support context, colour, and time-window pinning, but cannot be the sole source for a load-bearing claim.

---

## Query design conventions

Exa rewards specific, evidence-shaped queries. The conventions:

1. **Parallel fan-out, not serial probing.** When assembling a corpus, dispatch 6–10 queries in a single tool-use block, each targeting a distinct facet (geography, jurisdiction, time window, primary actor). Results triangulate naturally.
2. **Quoted phrases for unambiguous strings.** Bill numbers (`"S.1582"`), DOIs (`"10.5089/9798229042246.001"`), public-law numbers (`"Public Law 119-27"`), specific dates and vote counts. Quotes prune the long tail of fuzzy matches.
3. **Author + outlet for academic work.** `"Aldasoro" "Beltran" stablecoin parity deviation IMF working paper 2026` returns the working paper, RePEc cross-listing, and downstream summaries (three independent surfaces).
4. **Geographic balance check.** When a corpus spans jurisdictions, run one query per jurisdiction with native-language keywords (or English transliterations). Single-language search produces a US/UK-tilted corpus by default.
5. **Time window discipline.** Append a year to queries when a claim is time-boxed (`stablecoin Hong Kong licence "April 2026"`). Stale results are easy to spot.
6. **Avoid ambiguous slugs.** `deepsec-skill` collides with DeepSeek (Chinese AI lab) on fuzzy match; use `deepsec-skill.dev` (TLD-suffixed) when querying for self-citations. The lesson generalises: pick query terms that can't fuzzy-match unrelated targets.
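
Convention 1 can be sketched as a thin helper that dispatches the whole fan-out concurrently. This is a sketch under stated assumptions: `search` is a hypothetical stand-in for whatever web-search primitive you use (`web_search_exa`, Brave, Tavily); only the fan-out shape is the point.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(search, queries, top_n=5):
    """Dispatch all queries at once and keep the top-N results of each.

    `search` is any callable `query -> list[result]`. Reviewing the
    facet-specific result sets side by side is what lets the hits
    triangulate naturally.
    """
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        hits = list(pool.map(lambda q: search(q)[:top_n], queries))
    return dict(zip(queries, hits))


# One block of facet-specific queries, not serial probing.
queries = [
    '"GENIUS Act" "Public Law 119-27" signed Trump July 2025 stablecoin',
    'HKMA first stablecoin issuer licences "April 2026"',
]
```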

---

## Triangulation rule

Every load-bearing claim cited in any artefact must satisfy:

> **At least two independent Tier-1/2/3 sources confirm the claim.**

Independence requirements:
- Two outlets that re-publish the same wire feed do **not** count as two independent sources.
- An issuer's own press release (Tier 2) **paired with** an independent regulator filing (Tier 1) **counts** as two independent sources.
- Two journalism outlets (Tier 4) on a story sourced from the same anonymous "people familiar with the matter" do **not** count as two independent sources unless one outlet adds a verifiable primary detail (filing number, date, signed quote on the record).

Failures of triangulation are surfaced under Rule 5 (Honest uncertainty); never silently downgraded.
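
The independence test reduces to a set computation. A minimal sketch, assuming each source record carries a `syndication_group` label identifying the shared wire feed or anonymous sourcing it derives from; that field is implied by the rule above, not mandated by the schema:

```python
def triangulated(sources):
    """True if >= 2 independent Tier-1/2/3 sources confirm the claim.

    Sources sharing a syndication_group (same wire feed, same
    "people familiar with the matter") collapse to one.
    """
    groups = {s["syndication_group"] for s in sources if s["tier"] <= 3}
    return len(groups) >= 2
```

An issuer press release (`tier: 2`) plus a regulator filing (`tier: 1`) in distinct groups passes; two Tier-4 outlets, or two republications of one feed, do not.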

---

## The `references.json` schema

```json
{
  "id": "genius-act-bill",
  "tier": 1,
  "region": "US",
  "url": "https://www.congress.gov/bill/119th-congress/senate-bill/1582",
  "title": "S.1582 GENIUS Act bill text and status (Public Law 119-27)",
  "verified_via": "exa",
  "verified_on": "2026-05-07",
  "archive_url": null
}
```

Field rules:

- **`id`:** stable, hyphen-slug, lowercase. Short enough to cite inline (`ref:genius-act-bill`), specific enough to disambiguate. Once issued, never reassigned.
- **`tier`:** integer 1–5 per the canon table.
- **`region`:** ISO-2 country code where the source originates, or `EU` / `Global`. Used for coverage-balance audits.
- **`url`:** the canonical URL. If the source has both a landing page and a PDF, prefer the landing page; PDF goes in `archive_url`.
- **`title`:** the human-readable title as it appears at the URL. Update if the source title changes.
- **`verified_via`:** `"exa"` | `"head"` | `"manual"` | other tool slug | `null`. The schema is open per ADR-0004 (tool-agnostic):
  - `"exa"`: the recommended primitive, an actual web-search MCP confirmation. Equivalent-tool slugs (`"brave"`, `"tavily"`, `"perplexity"`) are accepted.
  - `"head"`: a direct URL liveness check via HTTP HEAD; a cheap fallback for stable government / standards-body URLs.
  - `"manual"`: content confirmation when the URL is reachable but blocks scrapers (e.g., SEC anti-bot 403).
  - `null`: the entry exists but has not yet been verified; a backfill candidate.
- **`verified_on`:** `"YYYY-MM-DD"` | `null`. Date the URL was last confirmed to load and match the cited claim. Per the verified-on cadence rule below, anything > 90 days old is **stale** and should be re-verified before the next cite.
- **`archive_url`:** optional. archive.org / archive.today / Wayback snapshot. Recommended for Tier-1 / Tier-2 sources that may move.

The file is a flat array of these objects. No nesting. Sort by `id` lexically so diffs are review-friendly.
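
The field rules above can be checked mechanically before review. A sketch, not a published schema: the slug regex and sort check encode this document's conventions, nothing more.

```python
import re
from datetime import date

ID_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def validate(entries):
    """Return a list of rule violations for a references.json array."""
    problems = []
    for e in entries:
        if not ID_RE.match(e["id"]):
            problems.append(f'{e["id"]}: id must be a lowercase hyphen-slug')
        if e["tier"] not in range(1, 6):
            problems.append(f'{e["id"]}: tier must be an integer 1-5')
        if e["verified_on"] is not None:
            try:
                date.fromisoformat(e["verified_on"])
            except ValueError:
                problems.append(f'{e["id"]}: verified_on must be YYYY-MM-DD or null')
    ids = [e["id"] for e in entries]
    if ids != sorted(ids):
        problems.append("entries must be sorted by id for review-friendly diffs")
    return problems
```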

---

## Verified-on cadence

| Tier | Re-verify cadence |
|---|---|
| 1 (regulator) | quarterly; government URLs are mostly stable, but sites reorganise |
| 2 (issuer) | quarterly; issuer redesigns move PR pages |
| 3 (IGO) | every 6 months; DOIs and IMF/BIS papers are persistent |
| 4 (journalism) | per-cite; outlet paywalls and link rot are common |
| 5 (specialist) | per-cite; small outlets disappear |

In practice the floor is **90 days** for any cited entry. If a finding-packet citation references an entry whose `verified_on` is > 90 days old, the agent surfaces a stale-reference warning under Rule 5 and triangulates fresh before emitting.
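
The 90-day floor is a plain date comparison. A sketch, with `today` passed in explicitly so the check stays deterministic in tests and batch audits:

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=90)


def is_stale(entry, today):
    """True if the entry was never verified or its last verification
    is older than the 90-day floor."""
    if entry["verified_on"] is None:
        return True
    return today - date.fromisoformat(entry["verified_on"]) > STALE_AFTER
```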

---

## How to assemble a corpus

The order of operations for a new specimen:

1. **Threat sketch first** (Rule 2). Don't pull sources before the assets / actors / vectors / controls / gaps are sketched. Otherwise every interesting URL feels load-bearing and the corpus bloats.
2. **Parallel Exa fan-out.** Dispatch 6–10 queries in one tool-use block, each targeting a distinct facet. Take the top-N results from each.
3. **Tier-tag every result.** Classify into 1–5 per the canon. Discard results that don't speak to a sketched asset / vector / control / gap.
4. **Geographic balance pass.** For multi-jurisdictional specimens, run one extra query per jurisdiction not already represented in Tier 1.
5. **Triangulate the load-bearing claims.** Pick the 5–10 claims that drive findings. Verify each against ≥ 2 independent Tier-1/2/3 sources.
6. **Surface enrichments.** Cross-references often surface details that strengthen the corpus (the JV behind a licensed entity, an audit firm name dropped by a single outlet, an authorship trio). Fold them in if they add information; flag them if they contradict.
7. **Write to `references.json`.** Each entry gets `verified_via: "exa"`, `verified_on: today`. Discard the entries that didn't pass triangulation.
8. **Cite by `id`, not URL.** In the specimen markdown, refer to entries by `ref:my-source-id` (or in-line URL with the id appended in HTML comment). Future re-cite is one place to update.
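
Step 8's cite-by-id convention can be resolved at render time. The `ref:` token syntax comes from the step above; the resolver itself is a hypothetical sketch, not part of the toolkit:

```python
import re

REF_RE = re.compile(r"ref:([a-z0-9-]+)")


def resolve_refs(markdown, references):
    """Replace every ref:<id> token with that entry's canonical URL."""
    index = {e["id"]: e["url"] for e in references}
    return REF_RE.sub(lambda m: index[m.group(1)], markdown)
```

Because every cite flows through the `id`, a rotated URL is a one-line edit in `references.json`, and an unknown id fails loudly at build time instead of shipping a dead link.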

---

## Worked example: the stablecoin specimen 8-claim cross-reference (2026-05-07)

This section is the proof that the methodology produces real, reviewable artefacts. The 8 highest-leverage claims in `specimens/stablecoin.md` v1.1 were each Exa-verified on 2026-05-07. Queries are exact; readers can paste them into Exa and reproduce the result.

### Claim 1: GENIUS Act became Public Law 119-27, signed 2025-07-18

> Query: `"GENIUS Act" "Public Law 119-27" signed Trump July 2025 stablecoin`
> Independent confirmations:
> - govinfo.gov primary text (`https://www.govinfo.gov/link/plaw/119/public/27`)
> - White House fact sheet (`whitehouse.gov/fact-sheets/2025/07/...`)
> - Debevoise & Plimpton legal alert
> - Greenberg Traurig client advisory
> - Mayer Brown legal update
>
> Senate vote 68-30 (2025-06-17), House vote 308-122 (2025-07-17), signed 2025-07-18. Confirmed across all five.

### Claim 2: HKMA first stablecoin licences (2026-04-10) to Anchorpoint Financial + HSBC

> Query: `HKMA first stablecoin issuer licences Anchorpoint HSBC April 10 2026 confirmation`
> Independent confirmations:
> - HKMA primary press release (2026-04-10)
> - Standard Chartered own press release (2026-04-10)
> - Coindesk
> - The Standard HK
> - The Paypers
>
> **Enrichment surfaced:** Anchorpoint = JV of Standard Chartered (HK) + HKT + Animoca Brands; licence numbers FRS01 and FRS02; HKMA processed 36 applications.

### Claim 3: Tether engaged KPMG + PwC (March 2026)

> Query: `Tether KPMG full audit financial statement March 2026 Ardoino McWilliams CFO`
> Independent confirmations:
> - tether.io primary press release (2026-03-24)
> - Coindesk (2026-03-27, citing FT)
> - Fortune (2026-03-24)
> - Ledger Insights
>
> **Enrichment surfaced:** Tether reportedly raising $15–20B at $500B valuation; CFO Simon McWilliams appointed early 2025 specifically for Big-Four readiness.

### Claim 4: IMF WP 2026/056 / BIS WP 1340 (Aldasoro/Beltrán/Grinberg), 40 bps parity deviation

> Query: `"Aldasoro" "Beltran" stablecoin parity deviation 40 basis points FX spillovers IMF working paper 2026`
> Independent confirmations:
> - IMF primary (`imf.org/en/publications/wp/issues/2026/03/27/...`)
> - BIS primary (`bis.org/publ/work1340.htm`)
> - IMF source PDF (`wpiea2026056-source-pdf.pdf`)
> - IDEAS RePEc cross-listing
> - FinRiskAlert summary
>
> DOI confirmed: `10.5089/9798229042246.001`. 69 pages. Authorised for distribution by Tobias Adrian.

### Claim 5: Circle IPO June 2025: $31.00, 19.9M primary, $583M net

> Query: `Circle IPO June 2025 NYSE 19.9 million shares $31 underwriters JPMorgan Citigroup`
> Independent confirmations:
> - Circle press release (2025-06-04)
> - SEC R8.htm primary (Circle 10-Q)
> - CNBC (2025-06-05)
> - Renaissance Capital
> - Fortune
>
> Note: 19.9M is the **primary** issuance net of over-allotment; total IPO including selling stockholders was 34M shares for $1.1B. The specimen's $583M net-proceeds figure refers to Circle's primary issuance.

### Claim 6: ESMA register: 19 EMT issuers, 29 EMTs, 0 ARTs (March 2026)

> Query: `"ESMA" MiCA "19 issuers" OR "29 stablecoins" OR "29 EMTs" zero ARTs March 2026`
> Independent confirmations:
> - BeInCrypto (origin)
> - cryptonews.net
> - AInvest
> - FinanceIQ Hub
>
> All four trace to the same ESMA register update. A quote from Circle's Patrick Hansen is the on-the-record primary attribution.

### Claim 7: BIS Papers No 170 (2026-05-05), three EM scenarios

> Query: `"BIS Papers" 170 stablecoins "international monetary" May 2026 dollarisation`
> Independent confirmations:
> - BIS primary (`bis.org/publ/bppdf/bispap170.htm`)
> - BIS Annual Economic Report 2025 cross-reference
> - BIS WP 1270 companion
>
> **Enrichment surfaced:** authors are **Aldasoro / Frost / Ito**, 33 pages, "approximately 98%" USD-denominated (BIS Annual Report says >99%; both within rounding).

### Claim 8: JPYC: Japan's first regulated yen stablecoin

> Query: `JPYC Inc fund transfer service provider licence August 2025 first regulated yen stablecoin launched October`
> Independent confirmations:
> - FINOLAB primary (2025-08-19)
> - FintechObserver (2025-10-24)
> - CoinMarketCap (2025-10-27)
> - BeInCrypto (2025-10-27)
> - CoinJournal (2025-10-27)
>
> **Enrichment surfaced:** registration number Kanto Local Finance Bureau Director No. 00099; runs on Avalanche, Ethereum, Polygon; targets 10 trillion yen ($67B) within 3 years; transaction limit 1M yen / transfer.

### Result

8/8 confirmed. 0 false. 5 enrichments folded into specimen v1.2.

---

## How a third party reproduces this

1. Open Exa (or equivalent web-search MCP).
2. Run each of the 8 queries verbatim.
3. Compare returned URLs to the confirmations above. Mismatch = either the methodology is wrong (regression) or the source rotated (re-verify needed).
4. Open `references.json`. Confirm each claim's supporting entries carry `verified_via: "exa"` and `verified_on: "2026-05-07"`.
5. Spot-check three random Tier-4 entries. Confirm the URL still loads.

If steps 1–5 succeed, the specimen's evidence layer holds. If any step fails, file an issue against the standard.
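
Step 5 can be scripted. A sketch under stated assumptions: the sample is seeded so independent reviewers draw the same entries, and the liveness probe is a plain HTTP HEAD (it needs network access, and sites that block scrapers will need the `"manual"` path instead):

```python
import random
import urllib.request


def sample_tier4(entries, k=3, seed=0):
    """Deterministically pick k Tier-4 entries to spot-check."""
    tier4 = [e for e in entries if e["tier"] == 4]
    return random.Random(seed).sample(tier4, min(k, len(tier4)))


def still_loads(url, timeout=10):
    """HEAD the URL; True if the server answers with a non-error status."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except OSError:
        return False
```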

---

## When to extend this methodology

Not every cite needs Exa cross-reference. The threshold is:

- **Required**: any load-bearing claim in a finding packet, threat sketch, or run closeout.
- **Required**: every entry in the standards spine (long-lived URLs, periodic re-verify).
- **Optional**: Tier 4/5 colour citations that don't change the conclusion if removed.
- **Skip**: in-line code references, GitHub repository URLs in the same project, internal cross-links.

The methodology is the spine of the evidence layer, not a tax on every paragraph.

---

*Methodology v1.0. 2026-05-07. MIT licensed alongside the standard. The Exa-driven approach is the default; equivalent web-search MCPs are accepted under the same triangulation rules. If you adopt this in your own toolkit, the only requirement is that the discipline survives intact. The brand of the search tool does not.*
