
Why SDK drift happens (and why it's getting worse)

March 28, 2026 · 6 min read

The problem isn't that teams don't care about their SDKs. Most platform engineers understand that generated client libraries are critical infrastructure. The problem is that the systems we use to maintain them make drift the path of least resistance. Generated SDKs live in a separate repo from the API spec. The connection between them is a human process — someone has to remember to regenerate after every API change. In practice, nobody does. Not because they don't care, but because the toolchain never made it automatic. The result is a slow, silent divergence between what your API does and what your clients think it does.

The generation gap

When SDK generation is a manual step, it drifts on the first busy week. We've seen the pattern play out dozens of times: an API change gets merged on Monday. A ticket to regenerate the SDKs is created on Tuesday. By Wednesday, the ticket is deprioritized behind a production incident. By the following sprint, nobody remembers it exists. Three weeks later, a staging environment starts throwing deserialization errors because the Python SDK still expects a field that was renamed in the API two releases ago.
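To make that last failure concrete, here is a minimal sketch of a stale generated model meeting a newer response. The `User` class, the field names `user_name` and `username`, and both payloads are hypothetical; a real generated SDK will look different but tends to fail the same way.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Model as generated before the rename (hypothetical)."""

    id: int
    user_name: str  # the API has since renamed this field to "username"

    @classmethod
    def from_json(cls, payload: dict) -> "User":
        # Strict deserialization: a missing key raises KeyError.
        return cls(id=payload["id"], user_name=payload["user_name"])


old_response = {"id": 1, "user_name": "ada"}  # shape the SDK was generated from
new_response = {"id": 1, "username": "ada"}   # shape the API returns today

User.from_json(old_response)  # still works against the old shape

try:
    User.from_json(new_response)
    error = None
except KeyError as exc:
    error = exc  # the stale model only fails once real traffic hits it
```

Nothing in the SDK repo changed between the two calls. The only difference is the payload, which is why this kind of break shows up in staging logs rather than in code review.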

The gap between “API changed” and “SDK updated” is where every client-server contract violation is born. It's not a gap that code review catches, because the API repo and the SDK repo are reviewed by different people at different times. By the time anyone notices the divergence, the damage is already in production — or worse, it's been silently absorbed by consumers who built workarounds that now depend on the broken behavior.

AI agents made it worse

Coding agents generate client code constantly now. They pull from whatever SDK is checked into the repo — which may be weeks or months out of date. The agent doesn't know. It has no way to know. It reads the type definitions, generates a perfectly valid API call against them, and the code compiles and passes type checks. The mismatch only surfaces at runtime, when the server returns a shape the client wasn't expecting.

The agents are doing exactly what they're supposed to — generating code from the available type information. The problem is that the type information is stale. The agent amplifies the drift because it generates more client code, faster, all of it based on an outdated contract. What used to be one developer making a mistake is now an agent making the same mistake across every integration point simultaneously.

The spec is not the source of truth anymore

Most teams think their OpenAPI spec is authoritative. It isn't — the deployed service is. The spec is aspirational. It describes how the API is supposed to behave, but nobody enforces that it actually does. When the spec and the deployed service diverge, the spec becomes documentation of how things used to work. Every team we've talked to has at least one endpoint where the spec says one thing and the production service does another.

This happens because specs are written by humans at one point in time, and services evolve continuously through pull requests, hotfixes, and migrations. Without a system that compares the spec against deployed reality on every change, divergence is structurally guaranteed. The spec decays into a historical artifact rather than a live contract. Your consumers trust it, your agents generate from it, and nobody knows it's wrong until something breaks.
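A rough sketch of what "comparing the spec against deployed reality" could look like at its simplest: check a captured response against the fields the spec declares. The schema format, the `find_divergence` helper, and the sample payload are illustrative assumptions, not an OpenAPI parser or any particular tool's API.

```python
# Hypothetical schema fragment: field name -> (expected Python type, required?)
SPEC_SCHEMA = {
    "id": (int, True),
    "email": (str, True),
    "nickname": (str, False),
}


def find_divergence(schema: dict, response: dict) -> list[str]:
    """Return human-readable mismatches between a schema and a live response."""
    problems = []
    for name, (expected_type, required) in schema.items():
        if name not in response:
            if required:
                problems.append(f"missing required field: {name}")
            continue
        if not isinstance(response[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}, "
                f"got {type(response[name]).__name__}"
            )
    # Fields the service returns but the spec never mentions.
    for name in response:
        if name not in schema:
            problems.append(f"undocumented field: {name}")
    return problems


# Stand-in for a payload captured from the deployed service.
live_response = {"id": "42", "email_address": "ada@example.com"}

for problem in find_divergence(SPEC_SCHEMA, live_response):
    print(problem)
```

Run on every change, even a check this small would surface the renamed field and the type change before a consumer does.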

Detection is harder than it looks

Most spec-diff tools compare structural shapes and flag what changed. But breaking-change detection requires evaluating what the change means for existing consumers. Removing a required field is breaking. Removing an optional field is a warning. A response field made required is breaking for some consumers and not others, depending on whether they check for it. A structural diff alone doesn't capture these distinctions.
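The difference can be sketched in a few lines. The schema layout and both function names below are hypothetical, and the rule is deliberately simplified to the removal case described above: a structural diff reports that two fields disappeared, while a classifier says which removal actually breaks consumers.

```python
def structural_diff(old: dict, new: dict) -> list[str]:
    """What a shape-only diff reports: the names of removed fields."""
    return sorted(name for name in old if name not in new)


def classify_removals(old: dict, new: dict) -> dict[str, str]:
    """Attach consumer impact to each removal (simplified, removals only)."""
    impact = {}
    for name in structural_diff(old, new):
        # Optional fields may legitimately be absent, so consumers
        # should already tolerate them going missing.
        impact[name] = "breaking" if old[name]["required"] else "warning"
    return impact


# Hypothetical response schemas: field name -> {"required": bool}
old_schema = {
    "id": {"required": True},
    "email": {"required": True},
    "nickname": {"required": False},
}
new_schema = {"id": {"required": True}}

print(structural_diff(old_schema, new_schema))    # ['email', 'nickname']
print(classify_removals(old_schema, new_schema))  # {'email': 'breaking', 'nickname': 'warning'}
```

Both functions see the same two removals. Only the second one tells you which of them should stop a release.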

We built a 34-scenario ground-truth corpus covering every kind of breaking change and tested both StateAnchor's gate engine and api-smart-diff, the most widely deployed open-source spec-diff tool. StateAnchor detected 100% of breaking changes. api-smart-diff detected 65%, missing 35% of the scenarios that would break a real consumer. A 35% miss rate on breaking changes is not a rounding error. It means roughly one in three breaking changes reaches your consumers undetected.

How StateAnchor fixes this

StateAnchor takes a different approach. Your stateanchor.yaml file lives in your Git repo as the desired state of your API. It is the single authoritative declaration. Every push to your repo triggers a pipeline that regenerates all artifacts (TypeScript, Python, and Go SDKs plus an MCP server) directly from that spec. There is no manual step. There is no ticket to create. The generation happens on the same commit as the change.

The gate engine sits between your spec change and the generated output. It diffs the new spec against the previous version, classifies every change as ERR (always blocks), WARN (blocks above threshold), or INFO (always passes), and makes a binary decision: proceed or block. If a breaking change is detected, the pipeline stops before any artifact is generated. Drift is structurally impossible because the generation is automatic, the validation is deterministic, and the spec is the only input. There is no gap for drift to enter.
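The decision logic described above can be sketched as follows. This is an illustration of the ERR/WARN/INFO rule, not StateAnchor's implementation: the `gate` function, the `warn_threshold` parameter, and the reading of "above threshold" as "more WARN changes than the threshold" are all assumptions.

```python
from enum import Enum


class Severity(Enum):
    ERR = "ERR"    # always blocks
    WARN = "WARN"  # blocks above threshold
    INFO = "INFO"  # always passes


def gate(changes: list[Severity], warn_threshold: int = 2) -> bool:
    """Return True to proceed with generation, False to block.

    Assumption: "above threshold" means the number of WARN changes
    exceeds warn_threshold.
    """
    if Severity.ERR in changes:
        return False
    warn_count = sum(1 for change in changes if change is Severity.WARN)
    return warn_count <= warn_threshold


print(gate([Severity.INFO, Severity.WARN]))  # True: one warning, under threshold
print(gate([Severity.INFO, Severity.ERR]))   # False: any ERR blocks
print(gate([Severity.WARN] * 3))             # False: three warnings exceed 2
```

Because the function depends only on the classified changes, the same spec diff always produces the same decision, which is what makes the gate deterministic.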

SDK drift isn't a discipline problem. It's an architecture problem. When regeneration is a manual step, drift is inevitable — not because your team is careless, but because manual processes fail under pressure. When regeneration is automatic and gated, drift is impossible. The question isn't whether your SDKs will drift. It's whether you'll notice before your consumers do.

Connect your repo in about 5 minutes. The gate runs on every push. Free to start.