The Autonomous AI-Native
Software Development Lifecycle.
Agents author, critique, validate, and ship. Every gate is deterministic. Every bypass is logged. Multi-vendor adversarial review, declarative branch protection, and per-SHA evidence: the SDLC built for codebases where most commits are AI-authored.
The pipeline
PRD to production —
engineered by agents, gated by code.
Humans set direction: write the PRD, refine the cycle spec, prompt the agents. Agents do the work: decompose into cycle specs, write code in TDD, run their own critics, validate in CI with adversarial multi-model review, deploy to preview, soak under load, and promote to production when every gate is green.
PRD
human-led: What to build, why, success criteria.
Cycle Spec
agent: Plan agent decomposes the PRD into cycle docs; a critic reviews the plan PR.
Local Dev
creator + critic: Creator writes code test-first (TDD); local critic enforces evidence per commit.
CI Gates
multi-agent: Deterministic invariants + adversarial cross-model critics.
Preview
auto-deploy: Auto-deployed environment per PR. Smoke + E2E.
Soak
eval + obs: Time-bounded. Eval harness, traces, error rates.
Production
auto-promote: Canary → full rollout. Auto-rollback on regression.
The eight services
Composable. Open-spec.
Hosted runtime where it counts.
Dark Factory decomposes into eight services. The schemas and CLI ship open-source. The aggregation policy, critic prompts, and severity calibration are the hosted IP that makes the verdicts honest.
Critic Orchestrator
Four critic vendors run in parallel under min-complete-quorum aggregation. Multi-vendor adversarial review beats single-model bias.
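The page does not define min-complete-quorum, so the following is only one plausible reading, sketched under stated assumptions: a verdict is issued only once a minimum number of critics have finished, and any blocking finding among the finished critics fails the commit. The names `CriticResult`, `aggregate`, and `min_complete`, and the default of 3, are hypothetical and not part of the Dark Factory schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriticResult:
    vendor: str
    completed: bool            # False if the critic timed out or errored
    blocking_findings: int = 0


def aggregate(results: list[CriticResult], min_complete: int = 3) -> str:
    """Return 'pass', 'fail', or 'incomplete' for one commit (assumed semantics)."""
    done = [r for r in results if r.completed]
    if len(done) < min_complete:
        return "incomplete"    # too few critics finished to form a quorum
    if any(r.blocking_findings > 0 for r in done):
        return "fail"          # one dissenting critic is enough to block
    return "pass"
```

Under this reading, a critic that times out can neither pass nor block a commit; it only counts against the quorum.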
Policy Engine
Severity thresholds, bypass classification, and reason taxonomy — declared in darkfactory.yaml, evaluated against every verdict.
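As an illustration of evaluating a severity threshold against a verdict, here is a minimal sketch. The severity names, their ordering, and the `block_at` parameter are assumptions made for the example; the actual darkfactory.yaml schema is not shown on this page.

```python
# Hypothetical severity scale, lowest to highest.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def blocks_merge(finding_severities, block_at="high"):
    """True if any finding is at or above the configured threshold."""
    threshold = SEVERITY_ORDER.index(block_at)
    return any(SEVERITY_ORDER.index(s) >= threshold for s in finding_severities)
```

Because the check is a pure function of the findings and the threshold, the same inputs always yield the same gate result.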
Trusted-Surface Rebind
When the PR diff touches verifier code, the critic runs against BASE — closing the self-modifying-policy attack surface.
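The rebind decision can be sketched as a choice of which ref supplies the policy. This is a minimal sketch, not the shipped implementation: the `TRUSTED_PATTERNS` list and the `policy_ref` function are hypothetical, and falling back to HEAD when no trusted path is touched is an assumption.

```python
from fnmatch import fnmatchcase

# Hypothetical list of paths that make up the trusted surface.
TRUSTED_PATTERNS = ["darkfactory.yaml", "spec.yaml", "critics/*"]


def policy_ref(changed_paths, base_sha, head_sha):
    """Pick the ref whose policy judges the PR."""
    touches_trusted = any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in TRUSTED_PATTERNS
    )
    # A PR that edits the rules is judged by the rules as they stood at BASE.
    return base_sha if touches_trusted else head_sha
```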
Per-SHA Evidence Store
Quality-gate verdicts keyed by commit SHA + diff hash. Every merge has a forensic trail. No cross-SHA leakage.
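Keying verdicts by commit SHA plus diff hash might look like the sketch below. The in-memory `EvidenceStore` class and the `sha:hash` key format are illustrative assumptions; the hosted store's real layout is not described here.

```python
import hashlib


def evidence_key(commit_sha: str, diff_text: str) -> str:
    """Combine the commit SHA with a SHA-256 hash of the diff."""
    diff_hash = hashlib.sha256(diff_text.encode("utf-8")).hexdigest()
    return f"{commit_sha}:{diff_hash}"


class EvidenceStore:
    """In-memory stand-in for a verdict store (hypothetical)."""

    def __init__(self):
        self._verdicts = {}

    def put(self, commit_sha, diff_text, verdict):
        self._verdicts[evidence_key(commit_sha, diff_text)] = verdict

    def get(self, commit_sha, diff_text):
        # A verdict recorded for another SHA or another diff is never returned.
        return self._verdicts.get(evidence_key(commit_sha, diff_text))
```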
Cycle Doc Trailer Validator
Every commit carries Cycle: / Issue: / ProjectItem: trailers. Lifecycle state machine refuses code without a traceable plan.
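A trailer check along these lines could be sketched as follows. It is a simplified parser that only inspects the last paragraph of the commit message and is not equivalent to git's own trailer handling; the function name `missing_trailers` and the sample trailer values in the usage are hypothetical.

```python
import re

REQUIRED_TRAILERS = ("Cycle", "Issue", "ProjectItem")
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(\S.*)$")


def missing_trailers(commit_message: str) -> list[str]:
    """Return the required trailer keys absent from the message."""
    # Trailers conventionally sit in the final paragraph of the message.
    paragraphs = re.split(r"\n\s*\n", commit_message.strip())
    found = set()
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match:
            found.add(match.group(1))
    return [key for key in REQUIRED_TRAILERS if key not in found]
```

For example, `missing_trailers("Add parser\n\nCycle: c-12")` reports that `Issue` and `ProjectItem` are missing, which a validator could turn into a rejected commit.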
Merge Queue Admission Policy
ALLGREEN + thread-resolution-required + zero human approvals. Plan-PR vs code-PR routing. AI-native default.
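The admission rule can be read as a pure predicate over the PR's state. The sketch below assumes a hypothetical `PullRequestState` shape and gate status strings; it covers only the all-green and thread-resolution conditions, not plan-PR versus code-PR routing.

```python
from dataclasses import dataclass


@dataclass
class PullRequestState:
    gate_results: dict        # gate name -> "green", "red", or "pending"
    unresolved_threads: int


def admit_to_merge_queue(pr: PullRequestState) -> bool:
    """Admit only when every gate is green and no review thread is open."""
    all_green = bool(pr.gate_results) and all(
        status == "green" for status in pr.gate_results.values()
    )
    # Human approvals are deliberately not part of the condition.
    return all_green and pr.unresolved_threads == 0
```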
Branch-Protection Drift Detector
spec.yaml is the desired state. Live ruleset is reconciled on every PR + weekly cron. Drift fails the gate.
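Drift detection reduces to comparing the desired state with the live one. This sketch assumes both have already been loaded into flat dictionaries; the field names used in the usage note are hypothetical and are not GitHub API fields.

```python
def find_drift(desired: dict, live: dict) -> dict:
    """Return every setting whose live value differs from the desired value."""
    drift = {}
    for key in sorted(set(desired) | set(live)):
        # A key missing on either side is treated as None in this sketch.
        if desired.get(key) != live.get(key):
            drift[key] = {"desired": desired.get(key), "live": live.get(key)}
    return drift
```

With `desired = {"required_approvals": 0}` and `live = {"required_approvals": 1}`, the function returns a single drift entry for `required_approvals`; a gate could fail whenever the result is non-empty.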
Audit + Compliance Trail
Structured bypass events with reasons, append-only NDJSON, SOC2-grade. Every override is an audited event, not a silent escape hatch.
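A structured bypass event in NDJSON is one JSON object per line. The sketch below assumes a hypothetical event shape (`type`, `sha`, `actor`, `reason`, `at`). Opening the file in append mode only avoids overwriting at the application level; true append-only guarantees would have to come from the storage layer.

```python
import json
from datetime import datetime, timezone


def bypass_event_line(sha, actor, reason, at=None) -> str:
    """Serialize one bypass event as a single NDJSON line."""
    event = {
        "type": "bypass",
        "sha": sha,
        "actor": actor,
        "reason": reason,
        "at": at or datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(event, sort_keys=True) + "\n"


def append_event(path, line):
    """Append a line to the log without touching earlier entries."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(line)
```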
The differentiated wedge
Four orthogonal moats.
None of them are copied by accident.
01 — Vendor portfolio
Multi-model adversarial review
Four critic vendors by default: Cursor SDK, Codex SDK, Gemini SDK, Grok direct. GitHub will default to OpenAI; Cursor will default to Anthropic. Critics that disagree are critics that catch things. A single vendor's review is not enough when the code under review was written by a model from the same family.
02 — Policy as code
Declarative + drift-detected
darkfactory.yaml is the gate config; spec.yaml is the branch-protection desired state; the drift detector runs on every PR. GitHub's tools are imperative and GUI-driven. Cursor's BugBot is app-configured. Neither is git-diffable.
03 — Provider neutrality
Sells into any model stack
Anthropic shops, Google shops, Mistral shops, open-weights shops, OpenAI shops — DF installs the same way. The provider-neutral posture is the wedge into orgs that have already committed to non-OpenAI primaries. That segment is growing.
04 — Trusted-surface rebind
Self-modifying policy can't sneak through
When a PR diff touches policy code, the critic runs against the BASE ref's policy. A PR cannot edit the rules that judge it. The trusted-surface rebind pattern closes a class of supply-chain-style attacks on the gate itself.
Install
One command, then every PR has a verdict.
The GitHub App handles the hosted-runtime case. The OSS CLI mirrors the local pre-push critic for power users, air-gapped deployments, and reproducible local-vs-CI verdicts. Same binary, same policy schema, same evidence store.
Get Started
Make your AI-authored code reviewable.
Multi-vendor critic portfolio. Deterministic merge gates. SOC2-grade audit trail. Install via the GitHub App or the OSS CLI.