How StateAnchor works
StateAnchor is a control plane for API evolution. It reads a desired-state declaration from Git, compares it against the last known state and optionally the live deployed API, enforces policy through a categorical gate engine, and derives all downstream artifacts (SDKs, MCP servers, docs) from a single canonical intermediate representation. This page describes the architecture end to end.
The three planes
The system is organized into three logical planes. Each has a distinct responsibility and a clear boundary with the others.
1. Desired-state plane
The stateanchor.yaml file in your repository is the single source of truth. It declares what your API should look like: endpoints, models, auth configuration, output languages, and gate policy. This file is version-controlled, code-reviewed, and commit-addressable. Every change to it is a deliberate, auditable act.
The desired-state plane answers: “What did we intend?”
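As a rough illustration, the declaration might parse into a structure like the following TypeScript sketch. The field names here are assumptions for illustration, not the authoritative `stateanchor.yaml` schema.

```typescript
// Hypothetical shape of a parsed stateanchor.yaml. Field names and value
// types are illustrative assumptions, not the real schema.
interface SpecDeclaration {
  name: string;
  endpoints: { method: string; path: string }[];
  models: Record<string, Record<string, string>>; // model -> field -> type
  auth: { type: "bearer" | "apiKey" | "none" };
  outputs: ("typescript" | "python" | "go" | "mcp" | "docs")[];
  gate: { warnThreshold: number };
}

// A minimal example declaration for a single-endpoint API.
const spec: SpecDeclaration = {
  name: "orders-api",
  endpoints: [{ method: "GET", path: "/orders" }],
  models: { Order: { id: "string", total: "number" } },
  auth: { type: "bearer" },
  outputs: ["typescript", "docs"],
  gate: { warnThreshold: 1 },
};
```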
2. Observation plane
On every push, StateAnchor observes the current state by comparing the incoming spec against the previous Intermediate Representation (IR). The diff engine identifies what changed: endpoints added, fields removed, types modified, auth altered. The gate engine classifies each change into ERR, WARN, or INFO lanes and decides whether the push should proceed or be blocked.
Optionally, the Live API Scanner probes the deployed API to detect actual-state drift — cases where the deployed behavior has diverged from the declared spec due to hotfixes, manual changes, or configuration drift that never made it back to the repo.
The observation plane answers: “What actually changed, and is the change safe?”
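The drift comparison can be sketched as a set difference over declared versus observed endpoints. This is a minimal sketch: the HTTP probing itself is elided, and `detectDrift` is a hypothetical name, not the scanner's real API.

```typescript
// Sketch of actual-state drift detection: compare endpoints declared in the
// spec against endpoints observed on the live API. The probe (explicit HTTP
// requests to each declared path) is elided; observed results are passed in.
type Endpoint = { method: string; path: string };

function detectDrift(declared: Endpoint[], observed: Endpoint[]): string[] {
  const key = (e: Endpoint) => `${e.method} ${e.path}`;
  const declaredSet = new Set(declared.map(key));
  const observedSet = new Set(observed.map(key));
  const drift: string[] = [];
  for (const k of observedSet) {
    if (!declaredSet.has(k)) drift.push(`undeclared endpoint live: ${k}`);
  }
  for (const k of declaredSet) {
    if (!observedSet.has(k)) drift.push(`declared endpoint missing: ${k}`);
  }
  return drift;
}
```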
3. Derivation plane
When the gate allows the push to proceed, StateAnchor derives all downstream artifacts from the canonical IR: TypeScript SDK, Python SDK, Go SDK, MCP server, API documentation. Each artifact is content-addressed with SHA-256, validated against structural and policy rules, and stored immutably. The derivation is deterministic relative to the IR — the same IR always produces the same artifact hashes (excluding generation non-determinism from the LLM, which is why we store artifacts by content hash rather than by spec version).
The derivation plane answers: “What should consumers use?”
The sync pipeline
Every sync run executes a five-stage pipeline, Stages A through E. Each stage is recorded in the sync run's log with timestamps and status, creating a full audit trail.
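The per-stage audit trail can be sketched as an append-only stage log; the record shape below is an assumption for illustration, not the real log schema.

```typescript
// Illustrative audit-trail record for one pipeline stage.
type StageStatus = "ok" | "failed" | "blocked" | "skipped";

interface StageLogEntry {
  stage: "A" | "B" | "C" | "D" | "E";
  status: StageStatus;
  startedAt: string; // ISO-8601 timestamp
  finishedAt: string;
}

// Append-only: return a new array rather than mutating the existing log.
function recordStage(log: StageLogEntry[], entry: StageLogEntry): StageLogEntry[] {
  return [...log, entry];
}
```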
Stage A — YAML validation
The stateanchor.yaml file is parsed and validated against the schema. This catches structural errors (missing required fields, invalid types, malformed YAML) before any processing begins. YAML anchors and aliases are rejected for security. If validation fails, the sync run is marked as failed immediately — no further stages execute.
Stage B — IR generation and gate evaluation
The validated spec is compiled into a canonical Intermediate Representation (IR). The IR is a normalized, language-agnostic data structure that represents every endpoint, model, parameter, and configuration in a consistent format. This is the data structure from which everything else is derived.
If a previous IR exists for this project, the diff engine compares the two and produces a list of findings: structural changes classified by type and severity.
The gate engine evaluates the findings against the project's policy:
- ERR lane — always blocks. Endpoint removal, field removal, type changes, auth changes, enum value removal.
- WARN lane — blocks when count meets threshold (default: 1). Required field added, field renamed.
- INFO lane — never blocks. Endpoint added, optional field added, description changed.
- PREDICTIVE_WARN — advisory overlay. Fires when drift velocity projects a threshold crossing within N commits.
If the gate decision is block, the pipeline halts. No code is generated. No artifacts are stored. The sync run is recorded as “blocked” with the full finding list and reason.
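The lane semantics above reduce to a small decision function. This is a sketch of that logic under the stated defaults, not the gate engine's actual implementation; the `Finding` shape is illustrative.

```typescript
// Sketch of the gate decision: ERR always blocks, WARN blocks once its count
// meets the threshold (default 1), INFO never blocks.
type Lane = "ERR" | "WARN" | "INFO";

interface Finding {
  lane: Lane;
  message: string;
}

function decideGate(
  findings: Finding[],
  warnThreshold = 1,
): "block" | "proceed" {
  const errs = findings.filter((f) => f.lane === "ERR").length;
  const warns = findings.filter((f) => f.lane === "WARN").length;
  if (errs > 0) return "block"; // ERR lane always blocks
  if (warns >= warnThreshold) return "block"; // WARN lane blocks at threshold
  return "proceed"; // INFO lane never blocks
}
```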
Stage C — Code generation
For each output language configured in the spec, the IR is passed to Claude (Sonnet 4.6) with a language-specific generation prompt. The prompt includes the full IR, a locked transport skeleton (for TypeScript), and instructions for the target language's conventions.
Generation is parallelized across languages using a concurrency limiter. Each generation call has a 240-second timeout with exponential backoff on rate limits.
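A concurrency limiter of this kind can be sketched with a shared work queue and a fixed pool of workers. Timeout and backoff handling are elided here, and `mapWithLimit` is an illustrative helper, not StateAnchor's actual code; the per-language Claude call is represented by `fn`.

```typescript
// Sketch of parallel generation under a concurrency limit: at most `limit`
// calls to fn are in flight at once, and result order matches input order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor over the work queue
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```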
Stage D — Structural validation
Every generated artifact is validated before storage:
- TypeScript/MCP — compiled with ts-morph in memory. Type errors, missing imports, and syntax errors are caught.
- Python — validated with mypy if available.
- Go — validated with go vet if available.
- MCP servers — additionally validated against Anthropic's Software Directory Policy (tool name length, annotations, error handling, auth patterns).
If validation fails, the artifact enters a single bounded retry with failure feedback injected into the generation prompt. If the retry also fails, the artifact is not stored and the failure is recorded in the sync run's context snapshot.
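The single bounded retry can be sketched as follows; `generate` and `validate` are stand-ins for the real generation call and compiler check, and the return shape is an assumption for illustration.

```typescript
// Sketch of Stage D's bounded retry: on validation failure, the failure
// message is injected as feedback into exactly one second attempt.
async function generateWithRetry(
  generate: (feedback?: string) => Promise<string>,
  validate: (artifact: string) => string | null, // null = valid, else error
): Promise<{ artifact: string | null; attempts: number }> {
  let artifact = await generate();
  let error = validate(artifact);
  if (error === null) return { artifact, attempts: 1 };
  // One bounded retry, with the validation failure fed back into the prompt.
  artifact = await generate(error);
  error = validate(artifact);
  // If the retry also fails, the artifact is not stored (returned as null).
  return { artifact: error === null ? artifact : null, attempts: 2 };
}
```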
Stage E — Content-addressed storage
Each validated artifact is SHA-256 hashed and stored in the forge_objects table, keyed by its content hash. If an identical artifact already exists (same hash), no duplicate is created — the system is naturally deduplicated.
A project_ref pointer is updated to point to the new artifact hash for each language/output type. The previous pointer value is recorded in the artifact_reflog, creating an append-only audit trail of every artifact change.
The artifacts are also stored in the artifacts table with the sync run ID, making them retrievable from the project detail page and the public API.
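The hash-then-point flow of Stage E can be sketched with Node's built-in crypto module. The in-memory maps below are simplified stand-ins for the forge_objects and project_refs tables, not the real storage layer.

```typescript
import { createHash } from "node:crypto";

// Simplified stand-ins for the storage tables.
const forgeObjects = new Map<string, string>(); // sha256 -> artifact content
const projectRefs = new Map<string, string>();  // "project:lang" -> sha256

// Sketch of Stage E: hash the artifact, insert only if the hash is new
// (natural deduplication), then move the per-language pointer.
function storeArtifact(project: string, lang: string, content: string): string {
  const sha = createHash("sha256").update(content).digest("hex");
  if (!forgeObjects.has(sha)) forgeObjects.set(sha, content); // dedup by hash
  projectRefs.set(`${project}:${lang}`, sha); // pointer update
  return sha;
}
```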
System overview diagram
How the external components connect to the sync pipeline:
+-------------------------------------------------------------+
| YOUR REPO |
| stateanchor.yaml <--- single source of truth |
| .github/workflows/stateanchor-sync.yml |
+--------------+----------------------------------------------+
| git push
v
+----------------------+ OIDC token +------------------+
| GitHub Webhook | ---------------> | StateAnchor API |
| (push event) | | (Vercel) |
+----------------------+ +--------+---------+
| enqueue
v
+------------------+
| Trigger.dev |
| (durable job) |
+--------+---------+
|
+------------------------------+---------------+
| SYNC PIPELINE | |
| v |
| +-----+ +-----+ +-----+ +-----+ +---+ |
| | A |→ | B |→ | C |→ | D |→ | E | |
| |YAML | |IR + | |Code | | Val | |SHA| |
| |Parse| |Gate | | Gen | |idate| |256| |
| +-----+ +-----+ +--+--+ +-----+ +-+-+ |
| | | |
+-----------------------+-----------------+---+
| |
+--------------+ |
v v
+------------------+ +------------------+
| Claude Sonnet | | Supabase |
| (generation) | | (Postgres) |
+------------------+ | . forge_objects |
| . project_refs |
| . sync_runs |
| . artifacts |
+--------+---------+
|
v
+------------------+
| Dashboard |
| (Vercel) |
| . project detail|
| . sync history |
| . artifact DL |
+------------------+
Pipeline data flow
What data flows between each stage of the sync pipeline:
stateanchor.yaml (Git repo)
|
v
[Stage A] YAML Parser ---> validated config object
|
v
[Stage B] IR Compiler ---> canonical IR (endpoints, models, auth)
|
+---> Spec Diff Engine ---> findings (what changed)
| |
| v
| Gate Engine ---> block / proceed decision
|
v (only if gate proceeds)
[Stage C] Claude Sonnet 4.6 ---> generated code per language
|
v
[Stage D] Compiler Validation ---> pass / fail per artifact
| MCP Policy Check ---> ERR / WARN findings
|
v
[Stage E] SHA-256 Hash ---> forge_objects (content store)
| ---> project_refs (pointer update)
| ---> artifact_reflog (audit entry)
| ---> artifacts (sync run linkage)
|
v
Dashboard + API ---> downloadable artifacts, provenance, audit trail
Component responsibilities
| Component | What it does |
|---|---|
| GitHub | Stores the spec file, triggers webhooks on push, provides OIDC tokens for Action authentication. |
| Trigger.dev | Executes the sync pipeline as a durable background job with retry, timeout, and deduplication. |
| Claude (Sonnet 4.6) | Generates SDK code, MCP servers, and documentation from the IR. Does not make gate decisions. |
| Supabase (Postgres) | Stores projects, sync runs, artifacts, gate decisions, drift exceptions, audit logs, and user data. |
| Vercel | Serves the dashboard, API routes, webhook handlers, and the docs site. |
| Clerk | Handles user authentication, session management, and GitHub OAuth for repo connection. |
What StateAnchor never touches
- Your source code. StateAnchor reads only stateanchor.yaml. It never clones, reads, or indexes your codebase.
- Your secrets. API keys and environment variables are never stored by StateAnchor. GitHub OIDC tokens are validated and discarded immediately.
- Your runtime traffic. StateAnchor does not proxy, intercept, or observe API traffic. The Live API Scanner makes explicit HTTP requests to declared endpoints only when manually triggered.
- Your deployment pipeline. StateAnchor evaluates and generates. It does not deploy, publish, or push artifacts to registries. You decide what to do with the generated output.
Content-addressed storage model
StateAnchor stores every generated artifact by its SHA-256 content hash. This model provides three properties:
- Deduplication. Identical artifacts share a single storage entry regardless of how many times they are generated.
- Integrity verification. Any artifact can be verified by recomputing its hash. If the hash does not match, the artifact has been tampered with.
- Safe rollback. Rolling back to a previous version is a pointer move, not a data operation. The project_ref pointer is updated to point to the previous artifact hash. The old artifact is never deleted — only the pointer changes.
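Rollback as a pointer move can be sketched as follows. The record shape and function names are illustrative assumptions; the real project_refs and artifact_reflog tables live in Postgres.

```typescript
// Sketch of pointer-move rollback: update the ref to the previous hash and
// append the change to an append-only reflog. No artifact data is touched.
interface RefLogEntry {
  ref: string;
  prevSha: string | null;
  newSha: string;
  reason: string;
}

const refs = new Map<string, string>();
const reflog: RefLogEntry[] = [];

function moveRef(ref: string, newSha: string, reason: string): void {
  const prevSha = refs.get(ref) ?? null;
  reflog.push({ ref, prevSha, newSha, reason }); // append-only audit trail
  refs.set(ref, newSha);                         // the only mutation
}

function rollback(ref: string): void {
  // Walk the reflog backwards to find the previous pointer value.
  const last = [...reflog].reverse().find((e) => e.ref === ref);
  if (last?.prevSha) moveRef(ref, last.prevSha, "rollback");
}
```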
Storage tables
| Table | Purpose | Key |
|---|---|---|
| forge_objects | Content-addressed blob store. Holds the actual artifact content. | SHA-256 hash (primary key) |
| project_refs | Mutable pointers from project + language to the current artifact hash. | project_id + language |
| artifact_reflog | Append-only log of every pointer change. Records prev_sha, new_sha, spec_sha, reason. | Auto-increment ID |
| artifacts | Links artifacts to sync runs for dashboard display and API access. | Auto-generated UUID |
When you view a project's artifacts in the dashboard, you are reading from the artifacts table joined to the latest sync run. When the system checks whether an artifact needs regeneration, it compares the IR hash against the cached generation key — if the IR has not changed, the previous artifact is reused without calling Claude.
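The regeneration check can be sketched as a hash comparison. This assumes the IR serializes deterministically (normalized key order); the function names are illustrative, not StateAnchor's real API.

```typescript
import { createHash } from "node:crypto";

// Hash the canonical IR; identical IRs yield identical generation keys.
function irHash(ir: unknown): string {
  return createHash("sha256").update(JSON.stringify(ir)).digest("hex");
}

// Regenerate only when the cached generation key no longer matches the IR;
// on a match, the prior artifact is reused without a model call.
function needsRegeneration(ir: unknown, cachedKey: string | undefined): boolean {
  return cachedKey !== irHash(ir);
}
```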