How the Gate Works

StateAnchor's gate engine evaluates every spec change against four independent diff anchors, classifies each change into a categorical lane (ERR / WARN / INFO), and produces an unambiguous verdict: block or proceed. This page explains the architecture end to end.

The gate runs on every push

When a push modifies stateanchor.yaml, StateAnchor fires automatically via the GitHub Action or GitHub App webhook. The pipeline is:

1. Parse and validate the incoming stateanchor.yaml
2. Compile to a canonical Intermediate Representation (IR)
3. Run four independent syndrome diffs against four reference points
4. Classify each finding into ERR, WARN, or INFO
5. Evaluate the gate policy and return a verdict
6. If passing: regenerate artifacts. If blocking: hold at last good state.
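The last step of the pipeline can be sketched as follows. This is a minimal illustration, not StateAnchor's implementation: `SyncState`, `finish_sync`, and the `"proceed"` / `"block"` verdict strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class SyncState:
    # Hypothetical state record, not a StateAnchor type.
    last_good_spec: dict   # spec from the last sync that passed the gate
    artifacts_from: dict   # spec the current artifacts were generated from


def finish_sync(state: SyncState, incoming_spec: dict, verdict: str) -> SyncState:
    """Apply the gate verdict: advance on pass, hold on block."""
    if verdict == "proceed":
        # Passing: artifacts are regenerated from the incoming spec,
        # which also becomes the new last good state.
        return SyncState(last_good_spec=incoming_spec, artifacts_from=incoming_spec)
    # Blocking: nothing moves; artifacts stay at the last good state.
    return state


state = SyncState(last_good_spec={"version": 1}, artifacts_from={"version": 1})
held = finish_sync(state, {"version": 2}, "block")
advanced = finish_sync(state, {"version": 2}, "proceed")
```

Here `held` is the unchanged `state`, while `advanced` points both fields at the incoming spec.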

The entire pipeline is a durable Trigger.dev job -- it survives network failures, timeouts, and Vercel cold starts. If StateAnchor is unreachable, the Action behavior is controlled by outage-policy (default: fail-closed -- the push is blocked). Set outage-policy: fail-open to allow pushes to proceed during outages.

Four independent syndromes

A single diff comparison has one failure mode: if the reference state is itself already broken, the delta is zero and the gate sees nothing. StateAnchor runs four independent syndrome comparisons in parallel, each anchored to a different reference point.

Syndrome	Reference point	Catches
`parent_diff`	Immediate parent commit	Changes introduced in this specific commit
`merge_base_diff`	Common ancestor with `main`	Accumulated changes in the entire branch, not just the latest push
`last_good_diff`	Last sync that passed the gate	Gradual drift across many passing syncs -- the boiling-frog problem
`deployed_diff`	Spec at last production deployment	Divergence between what is in Git and what is actually running

ERR fires if any syndrome produces a breaking change. The gate is conservative: a change that looks safe against the parent commit can still trigger ERR if the last-known-good comparison (`last_good_diff`) or the deployed comparison (`deployed_diff`) reveals accumulated contract drift.
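A small sketch of why the anchors matter, under heavy simplification: specs are reduced to a mapping from endpoint to response field names, and the only change detected is a removed field. The helper names and the sample specs are hypothetical; only the four syndrome names come from the table above.

```python
def removed_fields(reference: dict, candidate: dict) -> set:
    """Return (endpoint, field) pairs present in reference but missing in candidate."""
    removed = set()
    for endpoint, fields in reference.items():
        missing = fields - candidate.get(endpoint, set())
        removed.update((endpoint, name) for name in missing)
    return removed


def run_syndromes(candidate: dict, references: dict) -> dict:
    """Diff the candidate spec against every reference point independently."""
    return {name: removed_fields(ref, candidate) for name, ref in references.items()}


# The incoming spec no longer exposes "email".
candidate = {"/users": {"id"}}

references = {
    "parent_diff": {"/users": {"id"}},              # parent already lost "email"
    "merge_base_diff": {"/users": {"id"}},
    "last_good_diff": {"/users": {"id", "email"}},  # last passing sync still had it
    "deployed_diff": {"/users": {"id", "email"}},   # production still serves it
}

findings = run_syndromes(candidate, references)
err_fires = any(findings.values())  # True: one breaking finding in any syndrome is enough
```

In this example `findings["parent_diff"]` is empty because the parent was already broken, while `findings["deployed_diff"]` contains `("/users", "email")`, so the gate still raises ERR.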

This is the Swiss cheese model from safety engineering (James Reason's accident causation model): four independent barriers with uncorrelated failure modes. For API drift to reach production undetected, the holes in all four barriers have to align simultaneously.
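As a rough way to quantify this, assume the four syndromes really are independent and let p_i be the probability that syndrome i misses a given breaking change. The value 0.1 below is purely illustrative, not a measured miss rate.

```latex
P(\text{drift undetected}) = \prod_{i=1}^{4} p_i,
\qquad \text{e.g. } p_1 = p_2 = p_3 = p_4 = 0.1 \;\Rightarrow\; P = 10^{-4}
```

If the failure modes are correlated, the product understates the real risk, which is why each syndrome is anchored to a different reference point.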

See Syndrome independence for the formal model and worked examples.

ERR / WARN / INFO: categorical, not scored

Each finding is classified into exactly one lane. The lane is the decision signal -- not a score, not a threshold, not a percentage.

Lane	Verdict	Design intent
ERR	Always blocks	Unambiguous breaking changes -- a consumer will break. No threshold to configure. No score to game.
WARN	Blocks at or above threshold	Changes that may be intentional but introduce risk. Gate policy controls the count threshold (default: 1, so a single WARN blocks).
INFO	Always passes	Additive or cosmetic changes -- safe for consumers. Logged for visibility, never for enforcement.

Why categorical and not scored? Scores require threshold decisions, and thresholds get tuned under pressure until they stop blocking anything meaningful. Keeping the ERR lane free of knobs preserves the signal; the WARN count threshold is the only tunable. The 0-100 composite score shown in the dashboard is display-only; it never drives the gate verdict. See Gate engine for the full policy model.
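The lane rules in the table reduce to a few lines of logic. The sketch below is an assumption-laden illustration: `gate_verdict` and `warn_threshold` are hypothetical names, and the threshold is treated as inclusive so that the default of 1 blocks on a single WARN.

```python
from collections import Counter


def gate_verdict(lanes: list, warn_threshold: int = 1) -> str:
    """Return "block" or "proceed" from the lanes of all findings."""
    counts = Counter(lanes)
    if counts["ERR"] > 0:
        # ERR always blocks, regardless of any configuration.
        return "block"
    if counts["WARN"] >= warn_threshold:
        # WARN blocks once the count reaches the policy threshold.
        return "block"
    # Only INFO findings (or WARNs under the threshold) remain.
    return "proceed"


only_info = gate_verdict(["INFO", "INFO"])                   # "proceed"
one_warn = gate_verdict(["WARN"])                            # "block"
one_warn_relaxed = gate_verdict(["WARN"], warn_threshold=2)  # "proceed"
one_err = gate_verdict(["ERR"], warn_threshold=100)          # "block"
```

No score appears anywhere in the function: the verdict depends only on which lanes are present and how many WARN findings there are.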

33 change kinds

The gate recognizes 33 distinct change kinds, distributed across the three lanes. The distribution is deliberately uneven: the ERR set is the largest because breaking changes are the most varied.

Lane	Count	Examples
ERR	16	`endpoint_removed`, `field_removed`, `type_changed`, `auth_changed`, `required_param_added`, `response_schema_type_changed`
WARN	9	`optional_field_removed`, `response_field_required`, `response_constraints_relaxed`, `deprecation_violation`
INFO	8	`endpoint_added`, `field_added_optional`, `deprecated_flag_added`, `constraints_relaxed`

The kind count is an invariant locked in CI (tests D1-D4 in tests/integration/kind-registry.test.js). New kinds cannot be silently added. Full reference with before/after YAML examples for every kind: Gate kinds (33 total).
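The idea of a locked count can be illustrated with a short check. This is a Python sketch of the concept, not the contents of tests/integration/kind-registry.test.js; the registry below is filled with placeholder kind names because only the per-lane counts are taken from the table above.

```python
from collections import Counter

# Per-lane counts from the table above (16 + 9 + 8 = 33).
EXPECTED_LANE_COUNTS = {"ERR": 16, "WARN": 9, "INFO": 8}


def registry_problems(registry: dict, expected: dict = EXPECTED_LANE_COUNTS) -> list:
    """Compare a kind -> lane registry with the locked counts; empty list means OK."""
    counts = Counter(registry.values())
    problems = []
    for lane, want in expected.items():
        if counts[lane] != want:
            problems.append(f"{lane}: expected {want} kinds, found {counts[lane]}")
    for lane in sorted(set(counts) - set(expected)):
        problems.append(f"unknown lane: {lane}")
    return problems


# Placeholder registry: names like "err_kind_0" are NOT real StateAnchor kinds.
registry = {
    f"{lane.lower()}_kind_{i}": lane
    for lane, n in EXPECTED_LANE_COUNTS.items()
    for i in range(n)
}
baseline = registry_problems(registry)   # [] while the counts match

registry["new_kind"] = "INFO"            # someone adds a kind without updating the lock
drifted = registry_problems(registry)    # ["INFO: expected 8 kinds, found 9"]
```

A CI test that asserts the problem list is empty forces any new kind to arrive together with a deliberate update of the expected counts.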

Detection accuracy

StateAnchor's spec-diff engine was benchmarked against a 34-scenario ground-truth corpus covering every ERR and WARN kind. Results:

Engine	Detection rate	Notes
StateAnchor spec-diff	100%	34 / 34 scenarios detected
api-smart-diff	65%	22 / 34 scenarios detected

The corpus is in CI as a regression guard (tests/quality/detection-coverage.test.js): any future release that regresses detection accuracy fails a test. Full scenario-by-scenario breakdown: StateAnchor vs. diff tools.
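The percentages in the table follow directly from the scenario counts, and the regression guard amounts to checking that no corpus scenario goes undetected. The sketch below uses hypothetical function names and made-up scenario IDs; it is not the actual CI test.

```python
def detection_rate(detected: int, total: int) -> float:
    """Percentage of corpus scenarios an engine detected."""
    return 100 * detected / total


def missed_scenarios(corpus_ids: set, detected_ids: set) -> set:
    """Scenarios in the ground-truth corpus that the engine failed to flag."""
    return corpus_ids - detected_ids


stateanchor_rate = round(detection_rate(34, 34))  # 100
other_rate = round(detection_rate(22, 34))        # 65 (64.7 before rounding)

# Regression guard idea: fail as soon as any scenario is missed.
corpus = {"s01", "s02", "s03"}    # placeholder IDs, not the real corpus
detected = {"s01", "s03"}
missed = missed_scenarios(corpus, detected)       # {"s02"}
```

A guard of this shape fails on the first missed scenario rather than on a drop in the aggregate percentage, so a single regression cannot hide behind rounding.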

Outage behavior

If StateAnchor is unreachable -- network partition, deployment failure, Vercel cold start -- the Action behavior is controlled by the outage-policy input. The default is fail-closed: the push is blocked. Set outage-policy: fail-open to allow CI to proceed during outages (ADR-004). With fail-open, only local YAML validation runs and the result is soft-pass.

The tradeoff is explicit: the default protects against unreviewed spec changes landing silently during an outage. Teams that prefer not to block deploys on StateAnchor availability can opt into fail-open. The Live API Scanner provides an external absolute reference to compensate for periods when the gate could not run remotely.
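The outage decision can be sketched as a small function. The names `resolve_outcome`, `remote_verdict`, and `local_yaml_valid` are hypothetical; the outcome strings mirror the wording above. One point is an assumption not stated in this page: under fail-open, a spec that fails local YAML validation is treated as blocked.

```python
from typing import Optional


def resolve_outcome(
    remote_verdict: Optional[str],
    outage_policy: str = "fail-closed",
    local_yaml_valid: bool = True,
) -> str:
    """Decide the Action result; remote_verdict is None when StateAnchor is unreachable."""
    if outage_policy not in ("fail-closed", "fail-open"):
        raise ValueError(f"unknown outage-policy: {outage_policy}")
    if remote_verdict is not None:
        # StateAnchor answered: the gate verdict stands as-is.
        return remote_verdict
    if outage_policy == "fail-open":
        # Only local YAML validation runs during the outage.
        # Assumption: an invalid spec still blocks.
        return "soft-pass" if local_yaml_valid else "block"
    # Default fail-closed: the push is blocked.
    return "block"


default_outage = resolve_outcome(None)                          # "block"
open_outage = resolve_outcome(None, outage_policy="fail-open")  # "soft-pass"
normal_run = resolve_outcome("proceed")                         # "proceed"
```

Keeping "soft-pass" distinct from "proceed" makes it possible to find, after the outage, which pushes were never evaluated by the remote gate.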

Next: Gate kinds (33 total) -- Syndrome independence -- Detection benchmark
