Receipts, Not Prompts
AI saturation is real, but legitimacy saturates first
Konrad Kording and Ioana Marinescu’s Brookings working paper, “(Artificial) Intelligence Saturation and the Future of Work,” does something rare: it forces the AI-doom and AI-boom camps to stop arguing about vibes and start arguing about parameters.
Their core move is clean:
Split the economy into Physical production (P) and Intelligence production (I).
Let AI scale at computer-science speeds while labor and physical capital move at economic speeds.
Use a nested CES structure so “intelligence saturation” falls out mechanically when P and I are complements.
If P and I are complements, then as AI explodes you eventually hit a ceiling: returns to more I saturate because P becomes the bottleneck. That’s their headline.
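For intuition, here is a minimal one-level sketch of that mechanism in my notation; the paper's actual structure is nested and richer, but the limit behaves the same way:

$$Y = \left[\alpha P^{\rho} + (1-\alpha)\, I^{\rho}\right]^{1/\rho}, \qquad \rho < 0 \ \text{(complements)}$$

Because $\rho < 0$, the term $I^{\rho}$ vanishes as $I \to \infty$, and output converges to $\alpha^{1/\rho} P$: a hard ceiling set entirely by P. No amount of additional intelligence moves it.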
And crucially, they don’t just say “wages go down” or “wages go up.” They show wages can be non-monotonic: up early (scale effects), down later (reallocation into physical tasks), depending on substitution elasticities and how automation shifts employment shares.
This is a serious paper.
My claim is that it is also, in one key way, still too economics-clean for the world we actually inhabit.
Thesis (up front): even before the economy bottlenecks on P, it bottlenecks on G: governance, admissibility, liability, appeal, audit. AI makes persuasion abundant. It makes institutional permission to rely on outputs scarce.
So: I’ll steelman the paper, give seven clean cuts, then offer the alternative frame: G as the binding constraint, with falsifiable predictions.
Steelman first: what they get exactly right
1) “AI is just capital” is underspecified
They take on the standard macro move: treat AI like another capital input and you get slow, smooth change. Scaling-law intuition says “no, it’s explosive.” Their model nests both by separating sectors and letting AI scale asymmetrically.
2) The physical world is not tokenizable (at least not quickly)
They anchor the intuition in robotics and “effectors”: intelligence helps you use effectors efficiently, but effectors ultimately constrain throughput. That’s right. The economy has lots of hands-on bottlenecks: logistics, safety, embodied care, in-person differentiated services, hard physical capacity. Even many “cognitive” jobs are physical in their sense (surgery, classroom teaching, courtroom advocacy).
3) The model is unusually transparent
They put the whole machine on the table: nested CES blocks and an explicit definition of “AI capital” as effective compute (hardware × utilization × algorithmic efficiency). You can disagree with the parameters without hand-waving.
4) The wage ambiguity is real, and the decomposition is useful
Their wage effects decompose into: (A) scale, (B) physical output term, (C) reallocation pressure. That’s the right anatomy: you can’t talk about wages without asking where labor goes as tasks automate.
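Schematically, in my notation rather than theirs:

$$d\ln w \;=\; \underbrace{(A)}_{\text{scale}} \;+\; \underbrace{(B)}_{\text{physical output}} \;+\; \underbrace{(C)}_{\text{reallocation}}$$

The non-monotonic wage path above is just these terms trading dominance: (A) pushes wages up while AI is scaling everything, and (C) drags them down once automated-out workers crowd into physical tasks.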
So far: receipts.
Now the cuts.
The seven clean cuts
Cut 1: “AI is abundant” is doing hidden work
Their intelligence block relies on a regime where AI capital is “sufficiently abundant” relative to labor for not-yet-automated tasks. Fine as a limiting case, but it quietly assumes:
AI is cheap enough everywhere it matters,
inference is available at scale,
integration doesn’t throttle deployment.
In reality, abundance is local, not global: abundant in customer support, not abundant in regulated claims; abundant in marketing copy, not abundant in air-traffic control; abundant in prototypes, not abundant in production.
Prediction if this matters: “AI everywhere” in surface area, but not in the core throughput of high-liability systems.
Cut 2: CES smoothness hides lumpy reality
CES is the economist’s velvet hammer: elegant, tractable, too smooth.
The real economy is full of thresholds:
one missing approval blocks a workflow,
one ambiguous decision triggers a dispute spiral,
one integration failure prevents scaling.
CES can represent complements and substitutes. It does not naturally represent admissibility gates.
If the world is “Receipts, Not Prompts,” the production function needs a term for:
can this decision be defended, insured, appealed, audited, and repeated without unbounded tail risk?
CES doesn’t feel that cliff.
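One crude way to write the cliff (again my notation, not the paper's): put an admissibility gate in front of the CES block,

$$Y = \mathbf{1}\{G \ge \bar{G}\} \cdot F(P, I),$$

where $F$ is the smooth CES aggregate and $\bar{G}$ is the governance threshold a domain demands. Below the threshold, capability contributes nothing to deployable output; above it, the usual economics resumes. A step function is too crude, but it captures what a smooth elasticity cannot: the marginal product of I is zero until the gate opens.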
Cut 3: They model saturation as physical complementarity, but the binding constraint is often institutional
Sometimes effectors are the bottleneck. But in many high-value settings the binding constraint is not a robot arm. It’s a lawyer, regulator, auditor, safety engineer, or insurer.
In other words: the bottleneck is often not P broadly. It’s G: governance, admissibility, compliance, liability containment.
This isn’t philosophy. It changes adoption curves.
Cut 4: Compute-to-capability diminishing returns is not the main diminishing return
They include a diminishing-returns term for intelligence production. Good.
But the larger diminishing return in practice is often:
model capability → deployable impact.
That drop comes from data rights, privacy constraints, process redesign, tooling, human override, organizational politics, audit and appeal machinery.
Those are not physical effectors. They’re institutional effectors.
Cut 5: Wage = marginal product is clean; wage-setting is not
They note sectoral markdowns can matter. That footnote is a grenade with the pin still in.
If the intelligence sector has different employer power, contestability, credential moats, or winner-take-most dynamics, then “wages follow MP” becomes a weak guide. “AI crowding” can show up as lower bargaining power, credential inflation, fewer career rungs, and hollowed ladders.
Cut 6: Sector boundaries are endogenous and contested
Their P/I boundary is sensible, but it is not fixed. Firms actively move tasks across it:
converting services into digital substitutes,
standardizing exceptions away,
changing what counts as “good enough,”
pushing customers into self-serve flows.
So substitutability between P and I is not just a parameter. It’s a battlefield.
Cut 7: The transition is the story, and the model is too smooth about it
They simulate dynamics, but the core story still reads like: pick parameters, compare equilibria, wages move smoothly.
That’s not how general-purpose technology transitions actually feel.
The transition is dominated by frictions that aren’t “capital accumulation” frictions. They’re coordination and legitimacy frictions:
Who is liable when the model is wrong?
What is the appeal path?
What counts as sufficient evidence?
Who can override?
What gets logged?
Who signs their name?
These don’t show up as a mild parameter shift. They show up as adoption cliffs: sudden freezes, “AI everywhere” pilots, and then a long stretch of “AI nowhere that matters.”
Even if the long-run equilibrium has intelligence saturation, the medium run is governed by admissibility saturation. That’s where most distributional pain, and most upside, will be decided.
The alternative frame: not P vs I, but Action vs Legitimacy
The paper’s headline is: intelligence saturates because physical complementarity binds. Fair.
But there’s an orthogonal scarcity that arrives earlier and bites harder:
AI makes output abundant, and institutions allergic.
Reframe the variables:
I is not “intelligence.” It’s generated reasons (answers, plans, diagnoses, drafts, justifications).
P is not “physical.” It’s execution capacity (things that happen in the world).
G is the missing state variable: governance/admissibility capacity (the ability to rely on outputs without unbounded tail risk).
Key causal claim:
As AI scales, I grows faster than G, and the economy bottlenecks on G before it bottlenecks on P.
Put differently: you don’t hit the ceiling because there aren’t enough robot arms. You hit it because there aren’t enough receipts.
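A toy simulation of that timing claim, entirely my sketch with made-up growth rates, not the paper's calibration: let I compound at software speed, let P and G grow at economic speed with G slower, and let deployable output be gated by the scarcest of the three.

```python
# Toy model of the G-before-P bottleneck claim.
# All growth rates are illustrative assumptions, not estimates.

GROWTH = {"I": 0.60, "P": 0.05, "G": 0.02}  # assumed annual growth
stock = {"I": 1.0, "P": 1.0, "G": 1.0}      # arbitrary equal starting units

for year in range(1, 21):
    for k in stock:
        stock[k] *= 1 + GROWTH[k]
    # Deployable output is gated by the scarcest input: generated
    # reasons (I) only count where execution capacity (P) and
    # admissibility capacity (G) can absorb them.
    binding = min(stock, key=stock.get)
    if year % 5 == 0:
        print(f"year {year:2d}: I={stock['I']:7.1f}  P={stock['P']:4.2f}  "
              f"G={stock['G']:4.2f}  binding={binding}")
```

The comparative static is the point: with these numbers, G binds throughout while I races off the chart; double G's growth rate and the bottleneck flips to P, which is exactly the regime the Brookings model analyzes.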
One concrete vignette (what G feels like)
In a low-liability domain, “an explanation” is enough. In a high-liability domain, an explanation is a lawsuit-shaped object.
If an AI system denies a claim, flags a transaction, rejects a loan, downgrades a patient, or fires an employee, the question is not “was it plausible?” It is “can you produce a defensible chain of evidence and a repeatable appeal path?” The adoption bottleneck becomes the throughput of dispute resolution and liability containment, not the throughput of reasoning.
That’s G.
What is G, concretely?
Not “regulation” as a vibe. A stack of operational capabilities:
Evidence discipline
Decision traceability
Cheap, consistent appeals
Liability routing
Enforcement that bites
Override governance
Monitoring and incident response (drift, abuse, adversarial pressure, postmortems)
This is why “AI abundance” is local: you can have model capability and still have zero deployable automation where G is missing.
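To make “receipts” concrete, a hypothetical sketch of the artifact this stack would produce per decision. The schema and field names are mine, purely illustrative, not any real standard:

```python
from dataclasses import dataclass

# Hypothetical "receipt" an admissibility rail might attach to
# each automated decision. Illustrative only; not a real schema.
@dataclass
class DecisionReceipt:
    decision_id: str           # stable reference for audit and appeal
    model_version: str         # decision traceability: what produced it
    evidence: list[str]        # evidence discipline: inputs actually relied on
    rationale: str             # the generated reason, logged verbatim
    liable_party: str          # liability routing: who signs their name
    appeal_path: str           # cheap, consistent appeals: where to contest
    overridable_by: list[str]  # override governance: who can countermand
    incident_hooks: list[str]  # monitoring: drift, abuse, postmortems
```

Producing and honoring records like this at scale is a throughput problem, separate from model capability. That separation is the whole point.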
How this changes the wage story
Their wage ambiguity is real. But G changes the timing and the distribution.
A third destination for labor reallocation: not I → P, but I → G (audit, compliance engineering, safety operations, dispute handling, exception processing).
Wages become path-dependent and institution-specific: the same model produces very different outcomes depending on admissibility rails and jurisdiction.
The distributional conflict concentrates in exceptions: overrides and appeals become the real choke points. The political economy moves there.
If you want a true “future of work” question, it’s not “how many tasks automate?” It’s:
who controls the override layer?
Falsifiable predictions: how to tell which story is right
If the Brookings P-bottleneck story dominates, we should see:
declining employment share in intelligence tasks and big reallocation into physical work,
saturation patterns tightly linked to robotics and physical capital buildout,
adoption mainly explained by P/I complementarity.
If the G-bottleneck story dominates, we should see:
adoption cliffs in high-liability domains even with strong model capability,
explosive growth in governance tooling and governance labor,
wide cross-institution variance: same model, radically different deployment,
decisive constraints being appeals throughput and liability containment, not robotics.
Clean discriminator:
Over the next months and years, the sectors where AI changes the economy most won’t be the sectors with the best models.
They’ll be the sectors that build the cheapest admissibility rails.
Honest synthesis
In the long run, yes: physical complementarity can cap pure intelligence accumulation. You can’t build a car with pure intelligence.
In the medium run, the economy bottlenecks on G: legitimacy, auditability, appeals, liability.
And G is not epiphenomenal: building G changes effective complementarity by enabling standardization, modularity, and safe delegation, which then accelerates physical buildout.
So the hierarchy is:
First bottleneck: G (permission to rely on outputs)
Second bottleneck: P (execution capacity)
Third bottleneck: I (only after governance and buildout catch up)
Closing
Kording and Marinescu are right to force the debate into parameters. And they’re right that complementarity implies saturation in the limit.
But their clean two-sector framing hides the part of the transition that will decide actual outcomes: the race between reason generation and legitimacy production.
AI makes prompts plentiful.
The economy runs on receipts.
And until we can scale receipts, we won’t discover whether “intelligence saturation” is the ceiling, because we’ll stall far below it.