Trusted-Surface Rebind
When the PR changes the verifier itself, the verifier reruns from the BASE ref. The critic that judges your PR can't be the critic your PR modified.
The problem
Self-modifying critics are an attack surface.
The naive critic setup reads its config from the working tree. If a PR's diff includes a change to that config — say, lowering the severity of a finding it's about to produce, or rewriting the prompt that judges security-relevant code — the critic happily reads the modified config and approves the PR.
This isn't theoretical. It's the standard attack surface for any verifier that treats its own configuration as data. The fix isn't "trust nothing"; it's to bind the verifier to the BASE ref's config whenever the PR diff touches that config.
How it works
Detect verifier-surface changes; rebind to BASE before running.
Trusted-Surface Rebind declares a list of paths in spec.yaml that constitute the verifier surface: critic configs, prompts, aggregation policies, severity calibration, and drift-detector source. Before each critic run, the orchestrator inspects the PR diff against that list.
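As a rough illustration of the diff inspection described above, the check can be reduced to matching changed paths against the declared patterns. This is a minimal sketch, not Dark Factory's implementation: the function name `touched_trusted_paths` is hypothetical, the patterns are copied from the example config on this page, and Python's `fnmatchcase` is assumed to be a close enough stand-in for whatever glob semantics the orchestrator actually uses.

```python
from fnmatch import fnmatchcase

# Patterns copied from the example config on this page.
TRUSTED_SURFACE = [
    "tools/agent-review/**",
    ".agent-review/config.json",
    "scripts/ci/audit_branch_protection.py",
    "scripts/ci/validate_cycle_doc.py",
    "tools/branch-protection/spec.yaml",
    "darkfactory.yaml",
]


def touched_trusted_paths(changed_paths, patterns=TRUSTED_SURFACE):
    """Return the changed paths that fall under the trusted surface.

    Hypothetical helper. fnmatchcase lets "*" match "/" as well, so a
    pattern like "dir/**" covers every file below dir/.
    """
    return [
        path
        for path in changed_paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]
```

The list of changed paths could come from something like `git diff --name-only <base>...HEAD`; an empty result would mean no trusted-surface path is in the diff.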
If any path matches, the run rebinds: critic config, prompts, and aggregation are loaded from the BASE ref, not the PR head. The decision is logged as REBOUND with the specific paths that triggered it. When no trusted-surface paths are touched, a PASSTHROUGH decision is logged instead. Either way, the customer can audit why each critic ran with the config it did.
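The rebind decision itself can be sketched in a few lines. This is an illustration under stated assumptions rather than the actual orchestrator code: `rebind_decision` and `read_at_ref` are hypothetical names, the `config_ref` and `decided_at` fields are invented for the example, and the shape of a PASSTHROUGH entry is a guess, since only the REBOUND entry is documented on this page. `git show <ref>:<path>` is the standard git way to read a file as of a given ref.

```python
import subprocess
from datetime import datetime, timezone


def read_at_ref(ref, path):
    """Read a file as of a git ref, ignoring the working tree."""
    result = subprocess.run(
        ["git", "show", f"{ref}:{path}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def rebind_decision(sha, triggered_by, base_ref="origin/main", now=None):
    """Build a decision record for one critic run (hypothetical helper).

    triggered_by is the list of changed paths that matched the trusted
    surface; an empty list means the run passes through unchanged.
    """
    now = now or datetime.now(timezone.utc)
    rebound = bool(triggered_by)
    # "rebound_at" mirrors the documented REBOUND entry; "decided_at" is
    # an assumed name for the PASSTHROUGH case.
    timestamp_key = "rebound_at" if rebound else "decided_at"
    return {
        "sha": sha,
        "decision": "REBOUND" if rebound else "PASSTHROUGH",
        "triggered_by": list(triggered_by),
        "base_ref": base_ref,
        # Assumed field: the ref the critic's config and prompts load from.
        "config_ref": base_ref if rebound else sha,
        timestamp_key: now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

A caller would then load each verifier input with `read_at_ref(entry["config_ref"], path)`, so a REBOUND run never reads the PR's copy of its own rules.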
- Diff inspection at the start of every critic run — verifier paths declared in spec.yaml
- If any path under the trusted surface is in the diff, critic config + prompts come from BASE
- Per-commit decision is logged: REBOUND or PASSTHROUGH with the exact reason
- Defends against both intentional and accidental policy edits inside the PR under review
- Generalized from our own production use of Dark Factory — works on any DF-installed repo
trusted_surface:
  paths:
    - tools/agent-review/**
    - .agent-review/config.json
    - scripts/ci/audit_branch_protection.py
    - scripts/ci/validate_cycle_doc.py
    - tools/branch-protection/spec.yaml
    - darkfactory.yaml
  rebind_policy:
    on_diff_touches: load-from-base
    log_decision: true
    include_in_evidence: true

# Decision log entry (auto-emitted):
# {
#   "sha": "5d8e1a3",
#   "decision": "REBOUND",
#   "triggered_by": ["tools/agent-review/src/policy.ts"],
#   "base_ref": "origin/main",
#   "rebound_at": "2026-05-17T18:42:11Z"
# }
Get Started
Don't let a PR edit the rules that judge it.
A class of supply-chain-style attacks on the gate itself, closed by construction. On for every Dark Factory install.