barcik.training

Scenario Planning
for Generative AI

Eight currents. One habit. Your strategy.

Robert Barcik · July 2026 · robert@barcik.training

Introduction

The Question

After almost every training session I run, the same three questions arrive, in the same order. First, curious: where is this actually going? Then, quieter: will it take our jobs? And finally, from whoever owns a budget: so what should we bet on?

This booklet is my working answer to all three. It starts from a fact most of the debate skips: a large part of the future being asked about is already paid for. The four biggest hyperscalers have guided roughly $700 billion of capital expenditure for 2026 alone. That money is already flowing into data centers, GPU clusters, and training runs, and it will produce outcomes whether or not anyone is ready for them. You don’t have to guess what the labs believe about the future. You can read it off their balance sheets.

That reading skill is where we begin. The next section, the capex decoder, shows how every dollar of hyperscaler capex carries a date stamp: inference capacity arriving in roughly two years, training runs in three, research breakthroughs in four or more. Ten minutes with it and AI headlines start sorting themselves by which year’s money produced them.

After the decoder come eight currents: the forces actively moving the field over the next two to three years, from continued scaling and the efficiency race to sovereignty, displacement economics, and the politics of both power grids and job losses. Each current has its own data, its own thesis, and its own trigger signals to watch, and each closes with a dated trigger log recording which signals actually fired since the previous edition. The log is the part of this booklet that is supposed to change every time you open it.

A note on what this booklet is, and isn’t

Classic scenario planning builds a few internally consistent future worlds and asks how you’d fare in each. I do something deliberately different here: I track the forces those worlds would be built from, because in a field moving this fast the forces are more stable, and more observable, than any single future. The method is still scenario planning (foresee, watch triggers, adjust), applied to currents instead of finished scenarios. And I’m not going to tell you which future will play out; I don’t think that’s the right question. The right question is which signals you’d watch and how you’d respond. If a current goes stale, retire it. If a trigger fires, act. That stance is behind every page that follows. Where I do state what I personally expect, it gets its own scoreable page at the end: Where I’d Put My Chips.

The Method

The Capex Decoder
“Reading the future from money already spent”

How to read it

The $228B spent in 2024 funded training clusters for models shipping 2–3 years later, plus inference infrastructure for the next generation (GPT-4 itself already ran on 2023 hardware). The ~$700B guided for 2026 funds models whose architectures may not be designed yet. Every capex dollar carries a date stamp for when its capability arrives.

What it can’t tell you

Capex is a forecast made by people who can be wrong. It tells you what the labs expect to exist, which is different from what will. Efficiency leaps can strand the bet (Current 2), the funding can dry up before the payoff (Current 3), and a sovereign can sever the buyer from the seller (Current 4). The decoder dates the bets; the currents stress-test them.

When people debate whether AI will “live up to the hype,” they often miss a crucial fact: the investment decisions have already been made. The Big Four hyperscalers (Alphabet, Amazon, Meta, and Microsoft) spent a combined $228 billion on capital expenditure in 2024, up 62% from $140 billion in 2023. For 2025, guided spending reached $416 billion. For 2026, guidance points to roughly $700 billion: Microsoft around $190B, Amazon around $200B, Alphabet $175–185B, Meta $115–135B, with Oracle adding tens of billions on top. This money is flowing into GPU clusters, power infrastructure, and data centers, and the vast majority is earmarked specifically for AI.
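As a quick check on the "roughly $700 billion" figure, the per-company guidance quoted above can be summed into a range. This is a minimal sketch: treating the "around" figures for Microsoft and Amazon as point estimates is an assumption made here, and Oracle is left out because the text gives no number for it.

```python
# 2026 capex guidance in $B as (low, high), taken from the paragraph above.
# Assumption: "around $190B" and "around $200B" are treated as point estimates.
GUIDANCE_2026 = {
    "Microsoft": (190, 190),
    "Amazon": (200, 200),
    "Alphabet": (175, 185),
    "Meta": (115, 135),
}


def guidance_range(guidance):
    """Sum the low ends and the high ends of each company's guidance."""
    low = sum(lo for lo, _ in guidance.values())
    high = sum(hi for _, hi in guidance.values())
    return low, high


low, high = guidance_range(GUIDANCE_2026)
midpoint = (low + high) / 2
print(f"Big Four 2026 guidance: ${low}B to ${high}B (midpoint ${midpoint:.0f}B)")
```

The range brackets the ~$700B headline number before Oracle is added on top.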

The reason this is an instrument rather than just a large number is that capex doesn’t translate into capability instantly. Building a data center takes 12–24 months. Procuring the chips takes 6–12 months. Training a frontier model takes another 6–12 months. Post-training, safety testing, and deployment add 3–6 more. Each year’s spending splits into three bets running at different timescales: inference capacity arriving in roughly 2 years, training runs for models 3 years out, and research compute powering breakthroughs 4+ years away. Trace any year’s money forward and you get a dated map of the future the labs themselves are planning for. The timeline below is that map.

Timeline legend: each investment dot traces forward along three lanes: inference capacity (~2 yr), training runs (~3 yr), and research frontier (~4+ yr).

* Fable 5 (June 2026) is plotted without a parameter size: it reads as an early, safeguard-wrapped variant of the Mythos generation rather than a sizing of the 2026 cohort. The full generation (Mythos / Spud) stays on the 2027 mark; see the staircase note below. The dashed blue 2029 ring marks the author’s dated bet, argued with probabilities in Where I’d Put My Chips, that the generation financed by 2026–27 capex lands there. A bet, deliberately drawn on the chart, and open to being wrong.
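The stage durations quoted in the decoder can be added up as a rough plausibility check on the date stamps. The sketch below assumes the four stages run strictly one after another; that is a simplifying assumption made here, since in practice chip procurement can overlap with construction, which would shorten the total.

```python
# Stage durations in months as (min, max), taken from the decoder text.
STAGES = {
    "build data center": (12, 24),
    "procure chips": (6, 12),
    "train frontier model": (6, 12),
    "post-training, safety testing, deployment": (3, 6),
}


def sequential_lag(stages):
    """Total lag in months if every stage waits for the previous one to finish."""
    shortest = sum(lo for lo, _ in stages.values())
    longest = sum(hi for _, hi in stages.values())
    return shortest, longest


shortest, longest = sequential_lag(STAGES)
print(f"Dollar to deployed model: {shortest}-{longest} months "
      f"({shortest / 12:.2f}-{longest / 12:.2f} years)")
```

Under that assumption the range runs from a little over two years to four and a half, which is consistent with the roughly three-year lane the decoder assigns to training runs.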

The staircase pattern

Looking backward, AI capability has advanced in a staircase pattern: a major jump roughly every two years, followed by refinement within that generation. GPT-3 (2020) was dramatically surpassed by GPT-4 (2023), which was then refined through GPT-4 Turbo, GPT-4o, and eventually GPT-4o-mini, each iteration better and cheaper within the same capability tier. The same pattern appears with Claude 3 Opus giving way to Claude 3.5 Sonnet, then Claude Opus 4, refined through Opus 4.6 and 4.7. The next step, the Mythos generation, is due on this rhythm toward the end of 2026 and into 2027. What landed in June 2026 was an early variant of it; more on that below.

If this pattern holds, the $228 billion spent in 2024 is currently producing the training infrastructure for models that ship in 2026–2027. The $416 billion committed for 2025 funds models arriving in 2027–2028. And the ~$700 billion planned for 2026 funds capabilities whose architectures may not even be designed yet: research compute for ideas that haven’t been conceived. The clearest demonstration of this lag came on March 24, 2026, when OpenAI completed pre-training of GPT-6 (“Spud”) at the Abilene Stargate facility: a model funded by 2024’s capex, money committed roughly three years before its expected public launch. Anthropic’s next-generation “Mythos” sits in the same lane: limited testing with cybersecurity defenders through Q1 2026, and then, on June 9, 2026, a safeguard-wrapped variant shipped publicly as Claude Fable 5 (the unrestricted Mythos 5 stayed reserved for cyberdefenders), priced at $10 input / $50 output per million tokens. Capability paid for by money committed before its architecture had a name.

That June release is why Fable 5 carries an asterisk on the timeline above, plotted without a parameter count. Our reading: the staircase thesis holds. The full Mythos / Spud generation remains on the 2027 mark, exactly where the lag pipeline puts it, and Fable 5 looks like that generation released early, in a deliberately wrapped form. The shape of the release supports this. Instead of the final rounds of fine-tuning, the alignment and moderation work appears to ride on a software layer beside the model (conservative safeguards that route sensitive topics to Opus 4.8), and the timing, eight days after Anthropic’s S-1 filing, suggests the IPO calendar may have pulled the launch forward. Treat the early arrival as a preview of the step rather than the step itself: the trigger to watch is the full deployment.

Postscript, July 2026: three days after launch, a federal directive took both models dark worldwide; they returned gated, with Mythos flowing through Project Glasswing for large US organizations and Fable available via API tokens behind a strong moderation layer, at premium prices. The capability reading above is unchanged: the step previewed is real, and the Remote Labor Index measured it before the restrictions landed (Current 6). But the access story has become its own force. Current 4 picks it up, and the companion booklet gives it a full treatment.

Inside the capex decision

A number like $700 billion invites a lazy explanation: herd behavior, fear of missing out, a mania with better PR. It helps to picture how such a line actually gets defended. Imagine you are the hyperscaler executive who owns the AI capex bet. Before you sign, three things are put in front of you:

Capabilities. The candidate next-generation models that can enter the training pipeline this cycle: what they can already do in the lab, and where the curves are still moving.
Research roadmap. Which experiments are actually working, why the team believes the next architecture will pay off, and which specific bottlenecks the additional compute removes.
Demand by horizon. Expected adoption broken down by industry, segment, and use case at 1-year, 2-year, and 3-year horizons, with separate curves for inference, training, and enterprise integration.

These are not blind bets. A $700B line survives a board meeting only when all three of those slides look credible together, and the people reading them have information the rest of us lack: they have watched the next generation run in the lab, seen the internal capability curves, and read the demand signals from their largest customers. In a very literal sense, the boards approving this capex have already seen a piece of the future that consumers will meet as products two or three years later. That privileged view is exactly what makes the money readable as an instrument: when insiders with this much visibility keep raising the line, they are telling you what they saw.

The discipline cuts both ways, and that keeps the instrument honest. When the slides stop looking credible, the line gets cut, publicly, in an earnings call. Watching the slides is the planning skill.

How to use it from here

The decoder gives you a discipline for reading AI news. When a capability headline lands, ask which year’s money produced it. When a capex announcement lands, ask which years it will produce: serving capacity in two, models in three, breakthroughs in four. The eight currents that follow are the forces pulling on this machinery: whether the staircase keeps climbing (Current 1), whether efficiency strands the clusters (Current 2), whether the investment timeline survives contact with the revenue timeline (Current 3), whether a sovereign cuts the wire between you and your vendor (Current 4), whether any of it actually reaches production (Current 5), whether the hours-and-dollars math starts moving work (Current 6), whether the atoms of power, chips, and permits keep pace with the money (Current 7), and what politics does when the displacement math starts working (Current 8). Each current ends with trigger signals to watch, and the decoder is how you date a trigger when it fires.
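The two questions above (which years will this money produce, and which year's money produced this headline) reduce to adding or subtracting a lag per lane. A minimal sketch, assuming the three lags exactly as stated in the decoder; the function names are illustrative, and the research lane's "4+" is treated as a lower bound.

```python
# Lag per lane in years, as stated in the decoder. The research lane is "4+",
# so 4 is used here as a lower bound (an assumption of this sketch).
LAGS = {
    "inference capacity": 2,
    "training runs": 3,
    "research frontier": 4,
}


def arrival_years(capex_year: int) -> dict[str, int]:
    """Forward reading: when does a given year's money arrive as capability?"""
    return {lane: capex_year + lag for lane, lag in LAGS.items()}


def funding_years(headline_year: int) -> dict[str, int]:
    """Backward reading: which year's money could have produced this headline?"""
    return {lane: headline_year - lag for lane, lag in LAGS.items()}


for year in (2024, 2025, 2026):
    print(year, arrival_years(year))
print("2027 headline funded by:", funding_years(2027))
```

Read forward, 2024 money lands on the 2027 training mark where the full Mythos / Spud generation sits, and 2026 money lands on 2029, the year of the dated bet drawn on the timeline.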

Current 1

Continued Scaling
“Does the staircase keep climbing?”

The thesis

The capability staircase keeps stepping: a major jump roughly every two years, and the ~$700B guided for 2026 is priced on at least one more step above the Mythos / Spud generation now arriving. The money is committed either way; the outcomes will land on your roadmap whether or not you planned for them.

The risk

DeepSeek V3 achieved frontier performance at 10× less compute. If algorithmic efficiency leaps continue, today’s massive clusters could be overbuilt for training. But labs are betting inference will dominate, at roughly three-quarters of AI compute by 2030. The bet is on deployment scale as much as model size.

The bet, stated plainly

The capex decoder shows where the money goes and when it comes back as capability. This current is the bet that the spending keeps being right: that above the step the field just took, there is at least one more, and that the clusters being poured today will be the ones that train it.

Note what just happened to this bet: it began paying out. The 2024 cohort of capex was earmarked, in part, for models shipping 2026–2027, and on June 9, 2026, an early variant of the Mythos generation arrived as Claude Fable 5, with the full generation (Mythos proper, OpenAI’s Spud) still expected toward late 2026 and into 2027, right on the lag pipeline’s schedule. The live question is therefore twofold: whether the full deployment confirms the step the early variant previews, and whether the ~$700 billion of 2026 money buys a step above it, or the staircase flattens just as the most expensive clusters in history come online. That is what the trigger signals below are tuned to detect.

What is being built

The scale of individual projects is staggering. Elon Musk’s xAI deployed Colossus, a cluster of 100,000 GPUs, in just 122 days in late 2024. Project Rainier, the cluster Amazon built for Anthropic, spans roughly 500,000 chips. The Stargate project (a joint venture between OpenAI, Oracle, and SoftBank) targets 450,000 GPUs at its Abilene, Texas flagship and plans clusters exceeding one gigawatt of power, the equivalent of a nuclear power plant dedicated to AI, while Meta’s Hyperion campus in Louisiana is designed for five gigawatts. At these scales, the limiting factor shifts from chip availability to raw electrical power: the Three Mile Island nuclear facility is being restarted specifically to supply AI data centers. Whether the atoms can keep pace with the money is its own force now; Current 7 takes up that story in full.

The inference bet

A common misunderstanding is that all this money is about training bigger models. In reality, the labs are increasingly betting that inference, running models at scale for millions of users, will dominate AI compute. Deloitte’s 2026 predictions already put inference at roughly two-thirds of all AI compute this year (reported), and Brookfield’s infrastructure forecast has it at roughly 75% of AI compute demand by 2030 (reported). Training a frontier model is a one-time cost; serving it to every enterprise customer, every developer, every consumer product is a continuous cost that scales with adoption. The capex surge is as much about building the serving infrastructure for AI-powered products as it is about training the next generation of models.

This is the core planning insight of Current 1: even if you believe the current generation of AI is “good enough,” the investment already committed will produce outcomes over the next 2–4 years. Those outcomes (faster models, cheaper inference, new capabilities) will change what’s possible and what’s expected. Your plans need to account for a moving target rather than a snapshot.

$416B · 2025 Big Four capex
~$700B · 2026 capex (guided)
~2 yr · Capability staircase cycle
~75% · AI compute for inference by 2030 (Brookfield)

Trigger signals: what to watch for

A generally available (non-gated) release of the Mythos / Spud generation lands, and within 60 days at least three independent benchmarks (LMArena, SWE-Bench Pro, RLI) place it a clear tier above the Opus 4.8 / GPT-5.5 cohort
Sector AI revenue (lab ARR plus hyperscaler AI-segment disclosures) grows faster than combined Big Four AI capex for two consecutive quarters, so the $500B+ gap narrows in absolute dollars, in filings
Hyperscaler capex guidance continues rising >30% year-over-year through 2027
A frontier lab ships a named post-transformer or novel-MoE architecture with published utilization or scaling gains on the new mega-clusters, and a second lab adopts it within two quarters
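The capex-growth trigger in the list above can be checked mechanically against the figures already quoted in the capex decoder. A minimal sketch, assuming the rounded annual totals from the text ($228B, $416B, and the ~$700B guidance); the 2027 figure it prints is only the level the 30% threshold would imply, since the text gives no 2027 guidance.

```python
# Big Four capex in $B, from the capex decoder; 2026 is guidance (~700).
CAPEX = {2024: 228, 2025: 416, 2026: 700}
THRESHOLD = 0.30  # the ">30% year-over-year" trigger


def yoy_growth(series):
    """Year-over-year growth keyed by the later year of each pair."""
    years = sorted(series)
    return {later: series[later] / series[earlier] - 1
            for earlier, later in zip(years, years[1:])}


growth = yoy_growth(CAPEX)
for year, rate in growth.items():
    status = "above" if rate > THRESHOLD else "below"
    print(f"{year}: {rate:.1%} ({status} the 30% threshold)")

# Level that 2027 guidance would need to exceed for the trigger to keep firing.
needed_2027 = CAPEX[2026] * (1 + THRESHOLD)
print(f"2027 guidance must exceed about ${needed_2027:.0f}B")
```

Both observed steps clear the threshold by a wide margin, so the signal to watch is a deceleration toward it rather than an outright decline.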

Trigger log · July 2026 edition

Fired early · Next-gen capability jump, in variant form. June 9, 2026 · Claude Fable 5 released: the first Mythos-class model generally available, state-of-the-art on nearly every tested benchmark. Our reading: this is the 2027 generation arriving early as a safeguard-wrapped variant (alignment handled on a software layer rather than final fine-tuning, plausibly pulled forward by the IPO calendar), so the jump is real but previewed, not confirmed. Watch the full Mythos / Spud deployment toward late 2026–2027; that firing is what re-arms this trigger for the generation above.
Update · The preview was withdrawn, then re-gated. June 12 – July 2026 · Three days after launch, a federal directive took Fable 5 (and Mythos) dark worldwide; by early July access returned gated and expensive, with Mythos flowing through Project Glasswing for large US organizations and Fable via API tokens behind a moderation layer. The capability reading of this log’s first entry survives intact: the step exists, and the RLI measured it (16.1%, Current 6) before the restrictions landed. But capability existence and capability access have formally split; the access story now lives in Current 4.
Not yet · Revenue gap closing. Anthropic’s run-rate revenue roughly quintupled in five months (Current 3), but spending is still growing faster than sector-wide AI revenue.

Implications by role

Developer

New capability jumps every ~2 years. Build with model-agnostic abstractions. What’s impossible today may be trivial in 18 months.

Team Lead

Plan for continuous retraining of team skills. Each model generation changes what tasks can be automated and how.

CTO / VP Eng

Justify continued AI investment with the staircase argument: current spend funds future capabilities, not today’s.

Procurement

Expect higher API costs initially for each new generation, dropping quickly as competition arrives. Avoid multi-year lock-in.

Data: company earnings reports & guidance (Q1 2026) • Big Four = Alphabet, Amazon, Meta, Microsoft

Current 2

Efficiency Revolution
“How Much Does GPT-4 Cost?”

The thesis

The cost to train a GPT-4-class model fell ~95% in under two years. The cost to run one fell 99%. When capability becomes a commodity, the moat moves from the model to everything built around it.

The risk

Cost compression assumes algorithmic efficiency continues compounding. If frontier capability requires genuinely new architectures (not just MoE optimizations), the floor may be higher than extrapolation suggests. Mistral’s bet only works if the ceiling is low.

Training cost to reach a stated capability tier

Model	Org	Released	Training cost	Capability claim
GPT-4	OpenAI	Mar 2023	>$100M	Frontier: set the “GPT-4 class” bar
Llama 3.1 405B	Meta	Jul 2024	~$60M (compute)	Matches GPT-4 on most public benchmarks
DeepSeek V3	DeepSeek	Dec 2024	$5.6M (final pre-training run)	Matches/beats GPT-4o on key benchmarks
DeepSeek R1	DeepSeek	Jan 2025	+$294K (RL on V3 base)	Matches OpenAI o1 on reasoning
GLM-5.1	Zhipu	Apr 2026	undisclosed	744B MoE / 40B active, MIT license; 58.4% SWE-Bench Pro (beats GPT-5.4 and Opus 4.6)
Mistral Medium 3.5	Mistral	Apr 2026	undisclosed	128B dense, self-hostable on Hugging Face; 77.6% SWE-Bench Verified
DeepSeek V4-Pro	DeepSeek	Apr 2026	undisclosed	1.6T total / 49B active, hybrid attention; 80.6% SWE-Bench Verified

DeepSeek’s $5.6M figure is the final pre-training run only; Epoch AI estimates the full base-to-R1 cost at $6–7M, and parent company High-Flyer invested $500M+ in GPUs total. R1 reuses V3’s pre-training, so it isn’t a from-scratch GPT-4-class run. The cleanest like-for-like comparison is GPT-4 → DeepSeek V3: roughly 95% reduction in 20 months. The April 2026 wave (GLM-5.1, Mistral Medium 3.5, DeepSeek V4-Pro, plus Qwen 3.6 and Kimi K2.6) extends the curve: training costs are not publicly disclosed, but the resulting capability sits within striking distance of, or above, paid frontier on coding and reasoning.

Inference price per million output tokens

Date	Frontier tier (closed)	Sub-frontier closed	Open-source equivalent
Mar 2023	GPT-4: $60	—	—
Nov 2023	GPT-4 Turbo: $30	—	—
May 2024	GPT-4o: $15	—	—
Jul 2024	—	GPT-4o-mini: $0.60	Llama 3 70B (Groq): ~$0.79
Oct 2024	GPT-4o (cut): $10	—	—
Jan 2025	—	—	DeepSeek V3: $0.42
Apr 2026	Opus 4.7: $25	—	DeepSeek V4-Pro: $2.48 (~10× cheaper at frontier-equivalent)
Apr 2026	—	—	Qwen 3.6 35B-A3B: self-host (single RTX 4090, 73.4% SWE-Bench)

Like-for-like at the frontier tier: GPT-4 ($60) → GPT-4o ($10) is an 83% drop in 19 months. The often-cited 99% drop compares GPT-4 launch ($60) to GPT-4o-mini ($0.60), a different (cheaper) capability tier. That comparison is still useful as a “what does GPT-4-class output cost today?” question, though it is a cross-tier comparison rather than a frontier-to-frontier one. By April 2026 the comparison that matters most for buyers is closed frontier vs open-weight frontier at the same capability tier: DeepSeek V4-Pro at $0.28 input / $2.48 output per million tokens is roughly an order of magnitude cheaper than Opus 4.6/4.7 on coding-and-reasoning workloads it can actually serve.
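The three comparisons in this paragraph are the same arithmetic applied to different pairs of prices, which is why it matters which pair is being quoted. A minimal sketch using only the output-token prices from the table above; the helper name is illustrative.

```python
# $ per million output tokens, from the inference price table.
GPT4_LAUNCH = 60.00      # Mar 2023
GPT4O_CUT = 10.00        # Oct 2024, same frontier tier
GPT4O_MINI = 0.60        # Jul 2024, cheaper tier
OPUS_4_7 = 25.00         # Apr 2026, closed frontier
DEEPSEEK_V4_PRO = 2.48   # Apr 2026, open-weight


def pct_drop(old, new):
    """Percentage price decline from old to new."""
    return (old - new) / old * 100


like_for_like = pct_drop(GPT4_LAUNCH, GPT4O_CUT)   # frontier to frontier
cross_tier = pct_drop(GPT4_LAUNCH, GPT4O_MINI)     # frontier to cheaper tier
closed_vs_open = OPUS_4_7 / DEEPSEEK_V4_PRO        # price ratio, April 2026

print(f"Like-for-like drop: {like_for_like:.1f}%")
print(f"Cross-tier drop:    {cross_tier:.1f}%")
print(f"Closed vs open:     {closed_vs_open:.1f}x")
```

The 83% and 99% figures both come out of the same formula; the difference is entirely in which model is taken as the endpoint.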

Hypothesis · not measured data

Mistral CEO Arthur Mensch’s thesis (paraphrased from early-2026 interviews): generic intelligence will commoditize, so competitive advantage moves to specialized systems built around your specific data and domain. Below are the layers he points to. Segment widths are equal; this is a stake-in-the-ground for discussion, not a measured value distribution:

Model · Fine-tune · Data & RAG · Tooling · Domain expertise

Discussion: If the model is free, which of these layers is your team actually investing in, and which would Mensch say you should be?

The training cost freefall

GPT-4, released by OpenAI in March 2023, cost an estimated $63–100 million to train; Sam Altman confirmed publicly that the cost exceeded $100 million including research and development. By July 2024, Meta had trained Llama 3.1 405B, an open-weight model matching GPT-4 on most benchmarks, for roughly $60 million in compute (30.84 million H100 GPU-hours). Then in December 2024, DeepSeek released V3, a model that matched or exceeded GPT-4o on key benchmarks for just $5.6 million in GPU time.

That is a roughly 95% cost reduction in 20 months. A month later, DeepSeek released R1, which matched OpenAI’s o1 on reasoning tasks for an incremental $294,000 in training cost.

The caveats matter: DeepSeek’s $5.6 million figure covers only the final pre-training run. Their parent company High-Flyer invested over $500 million in Nvidia GPUs total, and the full cost from base model to R1 is estimated at $6–7 million by Epoch AI. But even the generous estimate represents a 90%+ reduction from GPT-4. The key innovations enabling this (FP8 mixed-precision training, mixture-of-experts architectures with load balancing, and custom CUDA kernels achieving 85%+ GPU utilization versus the industry average of 55–65%) are algorithmic rather than hardware-bound. They can be replicated.
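The reduction percentages and the Llama compute figure above can be reproduced from the numbers in the text. A minimal sketch: it uses $100M as the GPT-4 baseline, which the text gives as a floor (">$100M"), so the computed reductions are lower bounds under that assumption.

```python
GPT4_COST = 100e6          # $; the text says the cost exceeded this
V3_FINAL_RUN = 5.6e6       # $; final pre-training run only
FULL_BASE_TO_R1 = 7e6      # $; upper end of the Epoch AI estimate ($6-7M)
LLAMA_COMPUTE = 60e6       # $; Llama 3.1 405B compute cost
LLAMA_GPU_HOURS = 30.84e6  # H100 GPU-hours


def reduction(old, new):
    """Percentage cost reduction from old to new."""
    return (old - new) / old * 100


headline = reduction(GPT4_COST, V3_FINAL_RUN)     # the "roughly 95%" claim
generous = reduction(GPT4_COST, FULL_BASE_TO_R1)  # the "90%+" claim
implied_rate = LLAMA_COMPUTE / LLAMA_GPU_HOURS    # $ per H100 GPU-hour

print(f"GPT-4 -> V3 final run:   {headline:.1f}% cheaper")
print(f"GPT-4 -> full base-to-R1: {generous:.1f}% cheaper")
print(f"Llama 3.1 implied rate:  ${implied_rate:.2f} per GPU-hour")
```

The implied GPU-hour rate is a useful sanity check on the Llama figure: the $60M and 30.84M GPU-hours are consistent with each other at a bit under two dollars per H100-hour.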

The open-source convergence

The Stanford HAI 2025 AI Index Report documented the most important shift in the AI landscape: the performance gap between the best open-weight and proprietary models, measured by Chatbot Arena Elo ratings, shrank from 8.04% in January 2024 to 1.7% by February 2025, a 79% reduction in a single year. On MMLU specifically, the gap between US and Chinese models collapsed from 17.5 percentage points to just 0.3 between the end of 2023 and the end of 2024.

In July 2024, Llama 3.1 405B became the first open model to match or exceed GPT-4 across multiple benchmarks, roughly 16 months after GPT-4’s release. By early 2025, that lag had compressed further. Open-source models now represent 62.8% of all models by count, and the best open LLMs lag closed ones by 5–22 months on benchmarks. One analysis projected open-closed parity by Q2 2026; that projection has aged poorly at the very top, where the 2026 AI Index shows the Elo gap re-widening to roughly 3.3% after the frontier stepped again. Read the convergence as a per-tier story: each capability tier commoditizes on the 5–22-month lag, while a genuinely new tier can re-open the gap above it.
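The per-tier reading turns a frontier release date into a window in which an open-weight equivalent would be expected. A minimal sketch, assuming the 5–22-month lag quoted above holds for the next tier as well; that extrapolation is an assumption of this sketch, and the function names are illustrative.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, normalised to the first of the month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end, ignoring days."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def parity_window(frontier_release: date, min_lag: int = 5, max_lag: int = 22):
    """Earliest and latest month an open-weight match would be expected."""
    return add_months(frontier_release, min_lag), add_months(frontier_release, max_lag)


# Historical check from the text: GPT-4 (Mar 2023) to Llama 3.1 405B (Jul 2024).
print(months_between(date(2023, 3, 1), date(2024, 7, 1)), "months")

# Forward reading for the June 9, 2026 release discussed in the decoder.
earliest, latest = parity_window(date(2026, 6, 9))
print(f"Open-weight match expected between {earliest:%b %Y} and {latest:%b %Y}")
```

The historical GPT-4 to Llama gap falls inside the lag band, and the forward window gives a dated range against which the open-weight trigger can be scored.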

Inference pricing in freefall

The pricing evolution of OpenAI’s own API tells the commoditization story in dollar terms. GPT-4 launched at $60 per million output tokens in March 2023. GPT-4 Turbo brought that down to $30 in November 2023. GPT-4o launched at $15 in May 2024, then was cut to $10 in October. Meanwhile, GPT-4o-mini offered GPT-4-class performance at $0.60 per million output tokens, a 99% reduction from GPT-4’s launch price in under two years.

Open-source alternatives are even cheaper. Llama 3 70B via Groq costs roughly $0.79 per million output tokens. DeepSeek V3 is available at $0.42. Self-hosted 70B models on H100 hardware can reach approximately $0.07 per million tokens at full utilization. On average, open-source models cost 7.3 times less than their proprietary equivalents.
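The phrase "at full utilization" carries most of the weight in a self-hosting figure, and a small formula shows why. This is a sketch under loudly stated assumptions: the GPU hourly cost and the aggregate throughput below are hypothetical placeholders chosen for illustration, not figures from the source and not a reconstruction of how the $0.07 estimate was derived.

```python
def cost_per_million_tokens(gpu_hourly_cost: float,
                            tokens_per_second: float,
                            utilization: float) -> float:
    """Serving cost in $ per million tokens for one GPU.

    utilization is the fraction of each hour spent serving real traffic.
    """
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hourly_cost / (tokens_per_hour / 1e6)


# HYPOTHETICAL inputs, for illustration only.
HOURLY_COST = 2.00   # $ per GPU-hour (assumed)
THROUGHPUT = 8000    # aggregate batched tokens per second (assumed)

for utilization in (1.0, 0.5, 0.25):
    cost = cost_per_million_tokens(HOURLY_COST, THROUGHPUT, utilization)
    print(f"utilization {utilization:.0%}: ${cost:.3f} per million tokens")
```

Whatever the real inputs are, cost scales inversely with utilization: a cluster that sits idle three-quarters of the time costs four times as much per token as the full-utilization figure suggests.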

The April 2026 wave

Across a little over three weeks in April 2026 (April 7 to April 29), five open-weight releases landed at or near the frontier on coding. GLM-5.1 (Zhipu, April 7) is 744B MoE / 40B active under MIT license and posts 58.4% on SWE-Bench Pro, ahead of GPT-5.4 and Opus 4.6 on that bench. Qwen 3.6 (Alibaba, April 16) split into variants; the 35B-A3B open variant runs on a single RTX 4090 with quantization and posts 73.4% SWE-Bench Verified, the throughput sweet spot for solo developers and on-prem deployments. Kimi K2.6 (Moonshot, April 20) is 1T total / 32B active and introduces a 300-agent swarm primitive for parallel exploration on hard tickets, an agentic precursor that connects to Hours and Dollars in Current 6. DeepSeek V4-Pro (April 24) posts 80.6% SWE-Bench Verified at $0.28 input / $2.48 output per million tokens with a 1M-token context window, roughly an order of magnitude cheaper than Opus 4.6 at comparable coding capability, using hybrid attention (Compressed Sparse + Heavily Compressed) at about 27% of V3.2’s per-token FLOPs. Mistral Medium 3.5 (April 29) ships as a 128B dense model, self-hostable from Hugging Face, with 77.6% SWE-Bench Verified and configurable reasoning effort per request; it carries the cleanest EU-data-residency narrative on the market, paired with Mistral’s $400M ARR (January 2026) at a $13.8B valuation.

By mid-2026, the buyer question is no longer “is open-weight good enough.” It is which open-weight model per workload, which hosting stack, and which sovereign deployment shape. For EU enterprises in particular, these choices intersect directly with the sovereignty story in Current 4: for the first time, “self-hostable frontier-equivalent” is a literal product description rather than a euphemism.

Where does value go when the model is free?

Mistral CEO Arthur Mensch has been the most articulate voice on this shift. Across early-2026 interviews (the Big Technology Podcast in January, Davos and Bloomberg the same month, the Economic Times in February) he framed AI as becoming infrastructure, “a utility” measured by efficiency, capital discipline, and reliable delivery rather than novelty. His most quoted line: “My generation of engineers has more or less succeeded in commoditizing its own profession.” The corollary, which he argues consistently, is that competitive advantage will increasingly accrue to whoever builds the most specialized system around their specific data and domain, rather than to whoever has the largest model.

If Mistral is right, and the cost data supports the argument, then the model itself becomes a commodity layer, and value migrates to the layers around it: fine-tuning and domain adaptation, data pipelines and retrieval-augmented generation, tooling and orchestration (agent frameworks, MCP servers, evaluation pipelines), and ultimately domain expertise. The organizations that win in this scenario are the ones that understand their own problems best, whichever model they happen to run.

95% · Training cost drop in 20 months
$0.28 · DeepSeek V4-Pro input / MTok
128B · Mistral Medium 3.5 (self-hostable)
7.3× · Open-source cost advantage

Trigger signals: what to watch for

An open-weight model lands within 5 points of a closed frontier release on the same headline benchmarks ≤90 days after that release
A named Fortune-500 / CAC40 / DAX company states in an earnings call, filing, or press release that it moved a production workload from a proprietary API to open-weight, with a volume or cost figure
Inference costs drop below $0.10 per million tokens for GPT-4-class output
A hyperscaler explicitly cites model-efficiency gains as a reason for lowering capex or GPU-order guidance in an earnings call
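The sub-$0.10 trigger in the list above can be scored against the prices already quoted in this current. A minimal sketch, assuming the trigger counts only hosted API prices and not self-hosted estimates; that reading is an interpretation made here, and the `is_hosted_api` flag is an illustrative label.

```python
THRESHOLD = 0.10  # $ per million tokens, from the trigger list

# (label, $ per million output tokens, is_hosted_api), prices from this current.
OFFERS = [
    ("GPT-4o-mini", 0.60, True),
    ("Llama 3 70B via Groq", 0.79, True),
    ("DeepSeek V3", 0.42, True),
    ("Self-hosted 70B on H100, full utilization", 0.07, False),
]


def offers_below_threshold(offers, threshold=THRESHOLD):
    """Hosted API offers priced under the threshold; empty list means not fired."""
    return [label for label, price, is_hosted_api in offers
            if is_hosted_api and price < threshold]


fired = offers_below_threshold(OFFERS)
print("fired:" if fired else "not yet", fired)
```

On these inputs the list comes back empty: the only sub-threshold figure is the self-hosted estimate, which depends on full utilization rather than a published price.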

Trigger log · July 2026 edition

Counter-signal · Frontier prices went up, not down. June 9, 2026 · Claude Fable 5 launched at $10 input / $50 output per million tokens (double Opus 4.7), and its post-recall return added a gating layer on top of the price (API-only, moderated; see Current 4). The efficiency revolution is real one tier below the frontier; at the very top, a genuine capability jump still commands a premium, now with a scarcity component. Watch whether open-weight labs close this gap on the usual 5–22-month lag.
Not yet · Sub-$0.10 GPT-4-class inference. DeepSeek V3 at $0.42 and self-hosted 70B at ~$0.07 (full utilization) bracket the threshold; no major API has crossed it at honest quality.

Implications by role

Developer

Open-source becomes the default stack. Invest in fine-tuning, RAG, and tooling skills rather than prompt engineering for a single vendor.

Team Lead

Build team expertise in the “static layer”: data pipelines, evaluation, deployment. Model expertise commoditizes fast.

CTO / VP Eng

Reduce vendor lock-in. The value shifts from which model you use to how you integrate it. Invest in infrastructure and domain specialization.

Procurement

Shift budgets from API costs to infrastructure and talent. Self-hosting becomes economically viable for high-volume workloads.

Data: OpenAI API pricing history • DeepSeek technical reports • Stanford HAI 2025 AI Index • Epoch AI

Current 3

Financial Correction
“Have We Seen This Before?”

The thesis

A financial correction kills companies, not technology. Amazon survived a 94% stock drop. The internet kept working through the crash; the funding disappeared. So take AI working as a given, and ask the open question: does the investment timeline match the revenue timeline?

The risk

The parallel breaks in important ways. Dot-com investors were startups burning VC cash. Today’s AI investors are the most profitable companies in history spending from earnings. Nasdaq P/E today is ~26× vs 60× at the dot-com peak. Calling it a bubble may be correct on timing but wrong on magnitude.


Survivors vs. Casualties, then and now


Survivor

Amazon

Stock: −94%

Peak $106 → trough $5.51

Revenue never stopped growing.

$2.76B (2000) → $8.49B (2005)

Had $1B cash from well-timed bond offering

Today: ~$2.5 trillion market cap

Casualty

Pets.com

Raised $300M

Revenue: $619K

Dead 268 days after IPO.

Spent $70M+ on ads

Negative unit economics from day one

Today: a cautionary tale

Today

OpenAI

Valuation: $852B (reported)

Revenue: $25B ARR (reported)

Projects −$14B loss in 2026.

900M+ weekly users

Profitable by 2029 (earliest)

IPO window 2026–27 at $1T+; Anthropic filed first (speculation)

Casualty

Stability AI

CEO resigned Mar 2024

Revenue: <$5M/quarter

Burn rate: $8M/month

~$100M in debt

Couldn’t compete on frontier model training costs

The AI-era Pets.com?

12,000 · ChatGPT-sized products needed to justify current AI infrastructure capex (Barclays estimate)

The dot-com precedent

On March 10, 2000, the Nasdaq Composite reached an all-time high of 5,048.62. By October 9, 2002, it had fallen to 1,114, a 78% decline that destroyed over $5 trillion in market value. The Nasdaq did not set a new closing high until April 23, 2015, a recovery that took fifteen years. At the peak, venture capital investment had surged from roughly $7 billion in 1995 to nearly $100 billion in 2000, with internet companies absorbing 80% of all venture capital. Telecom companies invested more than $500 billion in infrastructure in the five years following the 1996 Telecommunications Act.

The lesson that most people take from this period is: “it was a bubble and it burst.” The more useful lesson is that the survivors and casualties were distinguished by three things: real revenue, real customers, and cash to survive a funding drought. The quality of their technology mattered surprisingly little.

The bubble argument has matured

The most visible bear voice of the past two years, Ed Zitron, has been notable not for being right but for how the argument has had to shift. His original case, sustained across blog posts and his “Better Offline” podcast, was economic: AI was a value-destruction machine, hyperscaler capex was insane, and the unit economics simply did not work. Some of that case has aged well (the ROI gap is real). Most of it has not. Frontier-tier inference prices fell roughly 83% in 19 months (99% if you accept a cheaper capability tier, a distinction Current 2 unpacks). Anthropic’s annualized revenue passed OpenAI’s in April 2026 (a reported $30B vs $25B) and reached a reported $47B run-rate by mid-May. Cost decline plus revenue growth made the original economic argument harder to sustain in its strongest form. Kelsey Piper, writing in The Argument, documented the shift: Zitron’s case has migrated from “the economics don’t work” toward fraud and accounting allegations against OpenAI and the hyperscalers.

The bear case is still alive, and parts of it remain sharp. But the goalposts moved, and that itself is a signal worth weighing. A bubble argument that survives collapsing costs and compounding revenue by switching from economics to fraud is a weaker argument than one that didn’t have to switch. Hold the correction scenario open; don’t hold this particular version of it as the bear case.

Amazon vs. Pets.com

Amazon’s stock fell 94% from roughly $106 in December 1999 to about $5.51 in late 2001. Yet its revenue grew every single year through the crash: $2.76 billion in 2000, $3.12 billion in 2001, $5.26 billion in 2003, $8.49 billion in 2005. It posted its first profitable quarter in Q4 2001 and its first full profitable year in 2003, with $35 million net income on $5.26 billion revenue. The key decision was a well-timed $1.25 billion bond offering that gave Amazon $1 billion in cash to survive the drought. Today it is worth roughly $2.5 trillion, over 800 times its trough market cap.

Pets.com raised $300 million total, spent over $70 million on advertising while generating only $619,000 in revenue, and shut down 268 days after its IPO. Webvan burned through $1.5 billion building automated warehouses before filing bankruptcy. Boo.com raised $135 million, burned it in 18 months, and sold its assets for under $2 million. The common thread: negative unit economics, no path to profitability, and complete dependence on the next funding round.

AI investment has entered unprecedented territory

The combined Big Four capex (Alphabet, Amazon, Meta, Microsoft) grew from roughly $140 billion in 2023 to $228 billion in 2024 (+62% year-over-year), to $416 billion in 2025, to roughly $700 billion guided for 2026, the same series the capex decoder plots. Capital intensity has reached historically unprecedented shares of revenue for these companies. Venture funding has concentrated similarly: global AI VC funding grew from roughly $45–50 billion in 2022 to $211 billion in 2025, the first year AI startups captured more than half (52.7%) of all global venture deal value.

OpenAI reached a reported $852 billion post-money valuation after its $122 billion funding round in March 2026. Annualized revenue hit a reported $25 billion by February 2026, up from roughly $2 billion in 2023. But the company projects a $14–17 billion loss in 2026, is not expected to be profitable until 2029 at the earliest, and has committed $600 billion in compute spending through 2030. Anthropic’s trajectory has been even steeper: revenue grew from $1 billion ARR in December 2024 to roughly $9 billion at the end of 2025 to a reported $47 billion run-rate by mid-May 2026, its Series H closed at a reported $965 billion post-money, and on June 1, 2026 it filed a confidential S-1, the most concrete IPO signal short of a roadshow.

The revenue gap

Sequoia partner David Cahn published “AI’s $600B Question” in June 2024, calculating that the AI infrastructure buildout requires roughly $600 billion in annual end-user revenue to justify itself. At the time, actual AI product revenue was roughly $100 billion, leaving a $500 billion annual gap. Since then, both spending and revenue have grown, but spending has grown far faster: capex roughly tripled while the revenue gap has likely widened, not narrowed. Barclays estimated that current capex levels would require the equivalent of 12,000 ChatGPT-sized products to break even.
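The gap arithmetic above is simple enough to keep as a reusable check. Here is a minimal Python sketch using only the two figures quoted from Cahn’s June 2024 piece. The growth rate in the last call is a hypothetical input, not a forecast, and the required-revenue figure is held flat purely for illustration.

```python
import math

REQUIRED_REVENUE_BN = 600.0  # Cahn, June 2024: annual end-user revenue needed
ACTUAL_REVENUE_BN = 100.0    # rough actual AI product revenue at the time


def revenue_gap(required_bn: float, actual_bn: float) -> float:
    """Annual shortfall, in $ billions."""
    return required_bn - actual_bn


def years_to_close(actual_bn: float, required_bn: float, annual_growth: float) -> float:
    """Years of compound growth until actual revenue reaches the required level.

    Assumes the required figure stays flat, which the text says it has not.
    """
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return math.log(required_bn / actual_bn) / math.log(1.0 + annual_growth)


print(revenue_gap(REQUIRED_REVENUE_BN, ACTUAL_REVENUE_BN))  # 500.0
# Hypothetical: revenue doubling every year (annual_growth = 1.0).
print(round(years_to_close(ACTUAL_REVENUE_BN, REQUIRED_REVENUE_BN, 1.0), 2))  # 2.58
```

Because capex has kept rising, the target moves. Rerun it with current figures, not the 2024 ones.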

Personal value is clear. Enterprise value is the open question.

Two ROI stories sit on top of each other and are routinely conflated. Individual subscribers buying Claude or ChatGPT at $20–$200 a month report clear value and keep paying: paid consumer plans for the two leaders together cross tens of millions of seats by mid-2026, churn is unremarkable, and surveys consistently show personal users describing meaningful time savings. That part of the market has answered. The enterprise market has not.

Omdia’s October 2025 survey of 350 mid-to-large enterprises reported “very good” to “extraordinary” ROI from most respondents, a genuinely positive signal. Accenture, in parallel, found 61% of enterprise AI subscriptions underutilized due to poor integration. The MIT NANDA study reported 95% of organizations seeing zero return, with the measurement caveats already noted (no baselines, six-month cutoff, parallel-pilot designs). Reconcile these and the picture is: enterprises that have integrated AI into workflows are extracting real value; the majority that are still trying are not. The model can usually do the task, so the gap has little to do with capability. The bottleneck is how the model gets wired into the workflow. That is the subject of Current 5.

Vendor concentration is the under-discussed risk

Q1 2026 saw AI venture funding concentrate to a degree that has no recent precedent in software. OpenAI, Anthropic, and xAI accounted for roughly 67.3% of all AI venture funding across more than 1,500 deals. OpenAI’s $122 billion round at an $852 billion valuation consumed a non-trivial share of global venture capacity. Microsoft, Meta, Amazon, and Alphabet collectively guided investors toward ~$700 billion of capex in 2026. Three foundation-model labs sit on top of a stack the rest of the industry rents from.

Concentration this extreme is usually argued as safety: the giants won’t fail. The Anthropic-Pentagon situation (covered in Current 4) is the case to study before agreeing. A single sovereign decision in February 2026 severed access to a major AI vendor for the entire US federal government, mid-contract, with little notice. The technology kept working. The vendor didn’t fail. The buyer simply couldn’t buy. June 12 then supplied the general case: one directive, and every user of a model lost it at once (Current 4). That is a vendor-concentration failure mode the dot-com analogy didn’t have. Stress-testing your AI strategy against vendor severance is now first-class planning work, not paranoia.

Why the parallel breaks, and why it might not matter

There are important differences from the dot-com era. Today’s leading AI investors are massively profitable companies spending from earnings, not startups burning venture capital. Nasdaq forward price-to-earnings ratios are approximately 26 times versus 60 times at the dot-com peak. Enterprise adoption is far more advanced: the large majority of big enterprises have implemented AI in some form, even if mostly in pilots. But the core structural risk, investment dramatically outpacing revenue realization, is identical. And new risks have emerged: AI-related corporate debt has ballooned to $1.2 trillion (JPMorgan), GPU rental prices have already fallen roughly 70% from peak, and the real useful life of GPU infrastructure may be 2–3 years rather than the 5–6 years used for accounting depreciation.
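The depreciation point is easier to feel with numbers. This is a minimal sketch of straight-line depreciation under the two useful-life ranges mentioned above. The $10B fleet cost is a hypothetical placeholder, not a figure from any filing.

```python
def annual_depreciation(cost_bn: float, useful_life_years: float) -> float:
    """Straight-line depreciation charge per year, in $ billions."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    return cost_bn / useful_life_years


FLEET_COST_BN = 10.0  # hypothetical fleet cost, for illustration only

# Accounting lives (5-6 years) versus the shorter real lives (2-3 years).
for life in (6, 5, 3, 2):
    charge = annual_depreciation(FLEET_COST_BN, life)
    print(f"{life}-year life: ${charge:.2f}B per year")
```

Under these assumptions the same hardware carries roughly 1.7× to 3× the annual charge once the shorter life is used. That is the size of the earnings adjustment hiding in the useful-life assumption.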

AI is valuable; that part is settled. The planning question is whether your specific vendors, tools, and providers are the Amazon or the Pets.com of this cycle.

$500–600B

Annual AI revenue gap

94%

Amazon’s stock drop (survived)

268 days

Pets.com IPO to shutdown

20%

Enterprises reporting AI-driven revenue (Deloitte)

$1.2T

AI-related corporate debt

Trigger signals: what to watch for

OpenAI or Anthropic IPO valuations correct significantly (>30%) within 6 months of listing
Hyperscaler capex guidance flattens or declines for the first time since 2022
≥3 AI-native startups that each raised >$100M fail or get acqui-hired within one quarter (the Inflection / Character.AI pattern)
A further ≥30% H100/B200 spot-rate decline within 12 months on published marketplace indexes, or sustained sub-$1.00/GPU-hour rates for a full quarter (H100 rates already ~70% off peak)
A default or impairment ≥$1B on AI-infrastructure debt, reported in filings or rating-agency actions (the CoreWeave-style stranded-asset scenario)
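Three of these signals have explicit numeric thresholds, so they can be checked mechanically once you have the data. The sketch below encodes them as small Python predicates. The function and parameter names are mine, and where the prices and rates come from (which marketplace index, which filing) is left to you.

```python
from typing import Sequence


def decline(old: float, new: float) -> float:
    """Fractional decline from old to new (0.30 means a 30% drop)."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (old - new) / old


def ipo_correction_fired(listing_price: float, low_within_6mo: float) -> bool:
    """IPO signal: a correction of more than 30% within six months of listing."""
    return decline(listing_price, low_within_6mo) > 0.30


def gpu_spot_fired(rate_12mo_ago: float, rate_now: float,
                   quarter_rates: Sequence[float]) -> bool:
    """GPU signal: a further >=30% spot decline in 12 months,
    or a full quarter of observations below $1.00 per GPU-hour."""
    further_drop = decline(rate_12mo_ago, rate_now) >= 0.30
    sustained_low = bool(quarter_rates) and all(r < 1.00 for r in quarter_rates)
    return further_drop or sustained_low


def debt_event_fired(impairment_usd_bn: float) -> bool:
    """Debt signal: a default or impairment of at least $1B."""
    return impairment_usd_bn >= 1.0
```

The other two signals (capex guidance flattening, clustered startup failures) need a judgment call on what counts, so I would log them by hand.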

Trigger log · July 2026 edition

Armed The IPO trigger is now live. June 1, 2026 · Anthropic filed a confidential S-1 (reported: Series H closed at $965B post-money; run-rate revenue ~$47B in mid-May, up from ~$9B at the end of 2025). The “>30% correction within 6 months of listing” signal goes from hypothetical to measurable the day it trades. Mark the calendar; this is the cleanest trigger in the booklet.
Not yet Capex flattening. All four hyperscalers raised 2026 guidance in the Q1 reporting cycle. The opposite of this trigger.

Implications by role

Developer

Diversify beyond AI-only skills. The developers who survived the dot-com bust were the ones who could build real products, not just demos.

Team Lead

Every AI project must have measurable ROI. “We’re exploring AI” won’t survive a budget cut. Show business value now.

CTO / VP Eng

Stress-test vendor viability. Which of your AI vendors is Amazon (real revenue, cash reserves) and which is Pets.com?

Procurement

Negotiate shorter contracts. Avoid lock-in with providers who may not exist in 18 months. Prefer pay-as-you-go over committed spend.

Data: Nasdaq historical data • Sequoia “AI’s $600B Question” • Barclays Research • MIT NANDA, Deloitte, Omdia, Accenture enterprise surveys • Kelsey Piper / The Argument

Current 4

Sovereignty
“What if your vendor isn’t allowed to sell to you?”

The thesis

Vendor access can collapse for political or jurisdictional reasons faster than for technical ones. The on-prem option is now real: Chinese open-weight is at frontier parity, Mistral ships dense and self-hostable, and Meta’s exit from open-weight frontier has shifted the “Linux of AI” mantle to a stack EU enterprises can actually deploy.

The risk

Self-hosting trades vendor risk for operational risk. Frontier-equivalent open-weight isn’t free to run, and the eval/safety burden moves onto you. Some open-weight options carry dataset-provenance questions that surface only after a regulator looks closely. And the openness itself is positional: Meta and Mistral both closed when their position allowed, so Chinese open-weight tracking the frontier is a strategy that can change, not a promise.

Two things changed in the last four months: a major US AI vendor was severed from its largest federal customer by executive action, and the Chinese open-weight stack closed the gap on coding and reasoning at roughly 10× lower cost. Sovereignty stopped being a paranoid’s concern.

Feb–Apr 2026

Anthropic–Pentagon timeline (designation → injunction → appeal → workaround)

$0.28 / $2.48

DeepSeek V4-Pro per MTok

744B

GLM-5.1, MIT licensed

128B dense

Mistral Medium 3.5 (self-hostable)

1 GPU

Qwen 3.6 35B-A3B on RTX 4090

Five anchors. The first is the failure mode; the next four are the alternatives that now exist.

Two collisions, one pattern

Two events in spring 2026 turned sovereignty from hypothetical to operational. The first: on February 27, 2026, the US Defense Secretary designated Anthropic a “supply chain risk,” and the Trump administration ordered federal agencies to stop using Claude. Anthropic and the Pentagon had signed a $200 million contract in July 2025 under Anthropic’s acceptable-use policy; the Pentagon wanted “all lawful purposes” access without limitation, and Anthropic refused to remove restrictions on autonomous weapons and domestic mass surveillance. Anthropic sued, won a preliminary injunction in late March (Judge Rita Lin called the designation “Orwellian” and First Amendment retaliation), then lost an appeals court bid in early April. As of June 2026 the litigation is still live, and the designation has proven narrower in practice than first implied: Anthropic argued, and Microsoft agreed, that it cannot reach customers outside the defense contracts themselves, so Claude remained available through M365, GitHub, and AI Foundry. But the headline lesson stands: the technology never stopped working. The buyer simply could not buy.

The second: on April 8, Meta launched Muse Spark, its first proprietary closed-weight model, from Meta Superintelligence Labs under Alexandr Wang. After nearly a decade of public commitment to open frontier AI, Meta’s frontier development is now closed. Existing Llama models remain available but no longer evolve. The combination of $115–135B in 2026 capex, competitive pressure from Chinese labs building commercial products on top of Llama, and the strategic goal of a deeply integrated “personal superintelligence” tied to Meta’s user data drove the shift. Yann LeCun, Meta’s most visible open-source advocate, departed in November 2025. The “Linux of AI” thesis did not survive contact with $100B+ compute economics, at least at the Western frontier.

June 12: the collision generalized

Then, days before this edition closed, the pattern escalated from procurement to existence. On June 12, 2026, three days after the Fable 5 launch, the US government had Anthropic switch off Claude Mythos 5 and Fable 5 for all users worldwide, under a directive scoped by foreign nationality that, being unverifiable in practice, forced a global takedown. By early July the capability was coming back, but gated and priced: full Mythos access limited to large US organizations through Project Glasswing; Fable restored only through API tokens, at premium prices, behind a strong moderation layer. February’s lesson was that one buyer can lose its vendor. June’s lesson is stronger: every buyer can lose the model at once, and access comes back, when it comes back, on the sovereign’s terms, tier by tier. The companion booklet was written in the hours after that morning; it treats June 12 not as an outage but as a dress rehearsal.

The Chinese open-weight wave fills the gap

Within 18 days in April 2026, three frontier-class open-weight coding models shipped from Chinese labs: GLM-5.1 (Zhipu, MIT-licensed 744B MoE / 40B active, 58.4% SWE-Bench Pro), Kimi K2.6 (Moonshot, 1T total / 32B active with a 300-agent swarm primitive), and DeepSeek V4-Pro (1.6T total / 49B active, 80.6% SWE-Bench Verified at roughly an order of magnitude lower output cost than Opus 4.6). Qwen 3.6 (Alibaba, April 16) split into variants; the 35B-A3B open variant runs on a single RTX 4090 with quantization. By workload as of mid-2026: DeepSeek V4-Pro for cheap large-context coding agents, Kimi K2.6 for hard multi-step tickets with its swarm primitive, GLM-5.1 for self-hosted production where MIT licensing matters, Qwen 3.6-35B-A3B for local laptop or single-GPU deployment. Llama 4 remains integration-default but is no longer evolving.

What this means for EU enterprises

The buyer question has shifted from “is open-weight good enough” to a multi-part procurement question: which open-weight per workload, which hosting stack, which sovereign deployment shape. Mistral Medium 3.5 (128B dense, self-hostable, $400M ARR at a $13.8B valuation) is the cleanest EU-data-residency narrative on the market: dense rather than MoE, easier to deploy than the Chinese stacks, sovereign-aligned. Self-hosted Chinese open-weight is the other major option, with two caveats worth flagging: geopolitical exposure if procurement frameworks tighten, and dataset-provenance questions that some EU regulators are starting to ask.

The deeper point: for the first time in this booklet’s lifetime, “self-hostable frontier-equivalent” is not a euphemism. Procurement teams that previously assumed one or two US vendors had no realistic alternative now have several. The work shifts from negotiating with one vendor to architecting around the choice. That choice depends on which sovereign failure modes you weight highest.

Where this current continues

Sovereignty is the one current where this booklet deliberately caps its own depth: the companion booklet, The Mercantilism of Generative AI, treats it at book length. If you take two pieces of it, take these. First, the armed-versus-targeted test: which industries get frontier access on favorable terms because the factory-holder can never become them, and which get the bloc’s champion pointed at their customers because their margin is the cognitive work the factory produces. Second, the “open is a position, not a principle” mechanism: the reason this chapter prices the Chinese open-weight option as a strategy that can change rather than a guarantee, and the source of the cleanest single tell in either booklet: the day a leading Chinese lab ships its best model closed is the day it believes it has taken the lead. The companion’s dated bets are wired into this booklet’s chips section.

Trigger signals: what to watch for

Additional supply-chain designations, recalls, or AUP collisions between frontier labs and sovereign buyers (fired June 12; see log)
New sovereign-AI regulation requiring on-shore inference, training data, or model weights is enacted (not proposed) in the EU, a member state, or a G20 economy
A Chinese open-weight release lands within 5 points of the closed frontier on its headline benchmarks ≤90 days after the closed release; and the inverse tell: a leading Chinese lab shipping its flagship closed fires this current, not Current 2 (see mercantilism Bet 5)
A named EU enterprise or top-10 EU systems integrator reports a production migration to sovereign or self-hosted stacks with a volume figure, or >25% of its new AI deployments sovereign-hosted
Counter-trigger: a signed agreement that formally rescinds an existing severance, designation, or recall (a full, ungated June 12 restoration would count)

Trigger log · July 2026 edition

Fired The kill switch, proven, then metered back on. June 12, 2026 · A federal directive scoped by foreign nationality (unverifiable in practice) had Anthropic switch off Mythos 5 and Fable 5 worldwide, three days after launch. By early July, access returned gated: full Mythos only for large US organizations via Project Glasswing; Fable API-only, at premium prices, behind a strong moderation layer. This is the second severance event in five months; the first trigger above has now fired twice. The companion booklet’s Bet 1 (gated return by end of 2026) fired within weeks of being written.
Plot twist The severed vendor is now a government partner. June 9, 2026 · Four months after the supply-chain designation, Anthropic shipped Claude Mythos 5, the strongest cybersecurity model in the world, to cyberdefenders through Project Glasswing, in collaboration with the US government, while the Pentagon litigation continues. Three days later the same capability was switched off worldwide (above). Sovereign access gets renegotiated capability by capability, demonstrably in both directions. Plan for that rather than for a clean “on/off” switch.

Implications by role

Developer

Hostable open-weight as default for sensitive workloads. Build retrieval and tool layers model-agnostic; the model is now portable.

Team Lead

Maintain a portable inference layer behind your eval and retrieval pipelines. Treat vendor swap as a planned exercise, not a fire drill.

CTO / VP Eng

Multi-vendor strategy is no longer paranoia. Quantify single-vendor severance risk against revenue and regulatory exposure.

Procurement

Contracts should anticipate vendor severance. Insist on data-residency guarantees and explicit exit clauses; price in a second supplier.
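For the developer and team-lead rows, a “portable inference layer” can be as small as one interface plus an ordered fallback. The sketch below is an illustration under stated assumptions: ChatBackend, StubBackend, and FailoverRouter are hypothetical names, and the stub stands in for whatever hosted or self-hosted client you actually use. No real vendor API is called.

```python
from __future__ import annotations

from typing import Protocol, Sequence


class BackendUnavailable(Exception):
    """Raised when a backend cannot serve the request (outage, severance, policy)."""


class ChatBackend(Protocol):
    """The only surface the rest of the application is allowed to depend on."""
    name: str

    def complete(self, prompt: str) -> str: ...


class StubBackend:
    """Hypothetical stand-in for a real client; echoes the prompt."""

    def __init__(self, name: str, available: bool = True) -> None:
        self.name = name
        self.available = available

    def complete(self, prompt: str) -> str:
        if not self.available:
            raise BackendUnavailable(self.name)
        return f"[{self.name}] {prompt}"


class FailoverRouter:
    """Tries backends in priority order and reports which one answered."""

    def __init__(self, backends: Sequence[ChatBackend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)

    def complete(self, prompt: str) -> tuple[str, str]:
        failed = []
        for backend in self.backends:
            try:
                return backend.name, backend.complete(prompt)
            except BackendUnavailable:
                failed.append(backend.name)
        raise BackendUnavailable("all backends failed: " + ", ".join(failed))


router = FailoverRouter([
    StubBackend("hosted-vendor", available=False),  # simulate a severed vendor
    StubBackend("self-hosted-open-weight"),
])
print(router.complete("ping"))
```

Returning the backend name alongside the text matters more than it looks: your eval pipeline needs to know which model produced each output, or a vendor swap will silently change your quality baseline.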

Data: court filings (Anthropic v. DoD) • vendor releases (Meta Muse Spark, Mistral Medium 3.5) • open-weight model cards (DeepSeek V4-Pro, GLM-5.1, Qwen 3.6, Kimi K2.6) • Kelsey Piper / The Argument

Current 5

From Lab to Production
“What we learned from 2015, and what’s different now”

The thesis

The bottleneck has moved from model capability to deployment. Same shape as the 2015 ML/DS production gap: statisticians could build models but not ship them. Now it’s LLMs, and the talent and tooling layer hasn’t caught up to the capability.

The risk

This time is different. Less data-pipeline work (LLMs generate, they don’t process). Much more testing and validation work (the damage potential is qualitatively higher). The MLOps cost curves don’t transfer cleanly: the new shape is eval-heavy, not ETL-heavy.

The capability ceiling overstates what you can deploy. The deployment floor is where the gap actually sits, and that floor is what enterprise AI roadmaps now hit first.

40 pts

Medical: 92% lab → 52% real-world (83-study meta-analysis)

71 pts

Coding: 97% HumanEval → 26% SWE-Lancer

20%

Enterprises reporting AI revenue impact (Deloitte 2026)

61%

Underutilized AI subscriptions (Accenture)

95%

MIT NANDA zero-ROI rate (measurement caveats)

Five lab-to-real-world gaps. The pattern is consistent across modalities, and it points away from capability as the binding constraint.

The 2015 parallel

Anyone who lived through machine learning’s enterprise adoption between 2014 and 2018 has seen this shape before. Statisticians and data scientists arrived from math and statistics backgrounds, fluent in modeling but uneven in software engineering. They built models in notebooks; the models worked in the notebook; the models did not ship. The gap was real and load-bearing rather than a fashionable complaint. The eventual resolution was a decade of work on MLOps, feature stores, model registries, and cross-functional teams pairing data scientists with software engineers and platform people. The capability was always there. The path from capability to production took the better part of ten years to build out.

The LLM gap has the same shape and is hitting enterprises hard right now. Capability has run ahead of the operational maturity to deploy it. Pilots multiply; production deployments lag. The MIT NANDA “95% zero ROI” figure has measurement issues, but even with conservative reframings the underlying message is correct: most enterprises haven’t finished the deployment side. The 61% of AI subscriptions Accenture identified as “underutilized due to poor integration” is the same story stated more carefully. The same talent and tooling gap. Same response: build the bridge.

What’s different this time

The 2015 parallel is the right scaffold but it isn’t a copy-paste. Two things are materially different, and they should reshape how teams budget the bridge.

First, far less data-pipeline work. The 2015 ML era spent enormous effort on data engineering: ETL, feature pipelines, training-serving skew, feature stores. LLMs invert most of that. They generate outputs from unstructured inputs rather than processing structured data; the data layer is retrieval and context assembly, not feature transformation; high-volume ETL is largely not the bottleneck. Teams that assume their LLM deployment needs an MLOps-shaped data team will mis-budget. The work is real but it sits elsewhere.

Second, far more testing and validation work. This is the part most enterprises systematically underestimate. An LLM can confabulate plausibly, an agent can take actions, output reaches end-users directly, and the damage potential of an undetected failure is qualitatively higher than “our regression test set drifted.” The work that was once 10–15% of MLOps spend (evaluation, monitoring, output review) becomes first-class infrastructure. Eval pipelines, red-teaming, calibration of human review, behavior change-management when a model upgrades: this is the deployment work itself. Teams that staff the bridge with the MLOps shape will discover the bridge is built wrong.

The benchmark-to-deployment gap, quantified

The same pattern shows up in every domain that has been measured carefully. GPT-4 achieved 92% diagnostic accuracy in controlled medical studies, but a meta-analysis across 83 studies found only 52.1% overall AI diagnostic accuracy in real-world settings, nearly a 40-point gap. (The two numbers come from different studies, models, and task designs, so don’t treat the subtraction as a measurement; the consistent shape, lab score far above field score, is the finding.) On the SWE-Lancer benchmark of real freelance coding tasks, even top models succeed only 26.2% of the time despite near-perfect HumanEval scores. On RE-Bench long-horizon tasks, AI systems score 4× higher than humans at 2 hours but humans outperform AI 2:1 at 32 hours. Deloitte’s 2026 enterprise survey found 20% of enterprises reporting AI-driven revenue, with two-thirds still stuck in pilot. Every one of these numbers describes a deployment problem, and none of them a capability problem.

Regulation as a secondary force raising the floor

Regulation plays a supporting role in this current rather than the headline, but it is a real second-order force, and one piece of news clarifies how to weight it. On May 7, 2026, EU negotiators reached provisional political agreement on the Digital Omnibus on AI: Annex III high-risk obligations are postponed from August 2, 2026 to December 2, 2027 (a 16-month deferral), and Annex I product-regulated high-risk obligations are deferred from August 2, 2027 to August 2, 2028. Watermarking and AI-content transparency shift by only three months, to December 2, 2026.

The delay should not reduce urgency for buyers. General-purpose AI model obligations under Articles 51–55 are unchanged and continue on the original schedule. Article 5 prohibitions are already in force. The Article 4 AI literacy obligation is already binding. Standards and guidance will still publish close to the new deadlines. The Code of Practice on synthetic content is expected to finalise in May or June 2026. What the omnibus moved was the most expensive, most operationally heavy obligations, precisely the ones tied to deployment of high-risk systems. The deployment gap is the headline; regulation is the floor underneath it, which the omnibus moved but did not remove.

Copyright runs in parallel. The Bartz v. Anthropic case produced a $1.5 billion class-wide settlement. The New York Times v. OpenAI multi-district litigation is still grinding through discovery: the expected spring 2026 summary judgment did not materialize; instead, a January 2026 ruling forced OpenAI to hand over a 20-million-conversation sample of ChatGPT logs, which will shape the fair-use fight. There are 56+ ongoing copyright lawsuits against AI companies. Every settlement raises the floor of data-provenance and due-diligence work required to deploy an LLM in production. Treat this as friction on the deployment side, not as a separate force.

What teams that bridge the gap actually look like

The 2015 resolution was cross-functional teams: data scientist plus software engineer plus platform engineer. The 2026 resolution looks similar in shape but reweighted: evaluation engineers, red-team specialists, and workflow designers become first-class roles. Less feature-store work; more behavioral testing. Less ETL; more output review. Less concept drift; more model upgrades that change personality. The teams that ship LLM features into production at scale in 2026 are the ones that have already staffed this shape. Most haven’t.

40 / 71 pts

Medical / coding lab-to-real-world gaps

20%

Enterprises reporting AI revenue (Deloitte)

61%

Underutilized AI subscriptions (Accenture)

+16 mo

EU Annex III delay (Digital Omnibus, May 2026)

$1.5B

Bartz v. Anthropic settlement

Trigger signals: what to watch for

First EU AI Act enforcement actions under remaining-on-schedule obligations (GPAI / Article 5)
Major copyright ruling against AI training (NYT v. OpenAI summary judgment; the expected H1 2026 date already slipped, now watch H2 2026, see log)
Your own internal pilot-to-production conversion rate stays below 20% across two quarters
Evaluation / red-team line items appear in the majority of enterprise AI RFPs you receive in a quarter, or a major analyst framework (Gartner, Forrester) adds them as a named category
Counter-trigger: an eval + monitoring + retrieval suite ships as a default bundled feature of at least two hyperscaler AI platforms (Bedrock, AI Foundry, Vertex) at no extra line item
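The internal conversion-rate signal is the one you can measure yourself today. This is a minimal sketch, assuming you count, per quarter, how many pilots were started and how many of those reached production. That cohort definition is my assumption, so adapt it to however your organization tracks pilots.

```python
from typing import Sequence, Tuple

THRESHOLD = 0.20  # the 20% line from the trigger above


def conversion_rate(pilots_started: int, pilots_in_production: int) -> float:
    """Share of a quarter's pilots that reached production."""
    if pilots_started <= 0:
        raise ValueError("pilots_started must be positive")
    return pilots_in_production / pilots_started


def conversion_trigger_fired(quarters: Sequence[Tuple[int, int]]) -> bool:
    """True when the two most recent quarters are both below the threshold.

    Each entry is (pilots_started, pilots_in_production), oldest first.
    """
    if len(quarters) < 2:
        return False
    return all(conversion_rate(s, p) < THRESHOLD for s, p in quarters[-2:])


# Hypothetical history: 10 pilots with 1 shipped, then 12 pilots with 2 shipped.
print(conversion_trigger_fired([(10, 1), (12, 2)]))  # True
```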

Trigger log · July 2026 edition

Moved NYT v. OpenAI summary judgment slipped. June 2026 · The merits ruling expected in spring did not land; the case is still in discovery, with a January order forcing OpenAI to produce 20 million ChatGPT conversation logs. The copyright floor under deployment keeps rising on settlements (Bartz: $1.5B), not yet on precedent.
Not yet EU enforcement. No first enforcement action yet under the obligations that stayed on schedule (GPAI, Article 5). The Digital Omnibus deferral (May 2026) bought deployers time on Annex III; it did not change this trigger.

Implications by role

Developer

Reliability and observability beat capability chasing. Invest in eval harnesses, behavior tests, and output review tooling earlier than feels necessary.

Team Lead

Staff evaluation and red-team roles as first-class, not as “someone’s side responsibility.” The bridge to production is mostly testing work.

CTO / VP Eng

Don’t copy your MLOps org chart. Reweight from data engineering toward evaluation engineering. The deployment cost curve is differently shaped.

Procurement

Require vendors to ship eval and monitoring tooling alongside the model. Factor the May 2026 omnibus delay into compliance budgets but don’t draw down readiness.
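To make “eval harnesses and behavior tests” concrete, here is a minimal sketch of a behavior suite that is run against two model versions and reports what regressed on upgrade. Everything in it is hypothetical: the two stub models, their canned outputs, and the checks are illustrations, and real suites need far more cases and better graders than substring matching.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Model = Callable[[str], str]   # prompt in, text out
Check = Callable[[str], bool]  # output in, pass/fail out


@dataclass(frozen=True)
class BehaviorTest:
    name: str
    prompt: str
    check: Check


def run_suite(model: Model, tests: Sequence[BehaviorTest]) -> Dict[str, bool]:
    """Run every test once and record pass/fail by test name."""
    return {t.name: t.check(model(t.prompt)) for t in tests}


def regressions(old: Dict[str, bool], new: Dict[str, bool]) -> List[str]:
    """Tests that passed on the old model and fail on the new one."""
    return sorted(name for name, ok in old.items() if ok and not new.get(name, False))


# Hypothetical stubs standing in for two versions of a deployed model.
def model_v1(prompt: str) -> str:
    return "I don't have that figure in the provided documents."


def model_v2(prompt: str) -> str:
    return "The figure is 42."  # answers confidently without a source


SUITE = [
    BehaviorTest("declines_when_ungrounded", "What was Q3 revenue?",
                 lambda out: "don't have" in out.lower()),
    BehaviorTest("non_empty_answer", "Summarize the ticket.",
                 lambda out: bool(out.strip())),
]

print(regressions(run_suite(model_v1, SUITE), run_suite(model_v2, SUITE)))
```

The design choice worth copying is the diff between versions. A model upgrade is a behavior change, so the useful output is the list of tests that used to pass, not an absolute score.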

Data: SWE-Lancer / SWE-Bench leaderboards • medical AI meta-analysis (83 studies) • Deloitte 2026, Accenture, MIT NANDA enterprise surveys • EU AI Act + Digital Omnibus (May 2026) • Bartz v. Anthropic settlement

Current 6

Hours and Dollars
“The two units that will decide displacement”

The thesis

Capability is becoming a function of two things employers can actually measure: how many hours an agent can work undisturbed, and what it costs per hour compared to the human it would replace. Both are improving fast. Parameter counts and MMLU don’t enter the conversation.

The risk

Theatrical demos overstate the second unit. Sub-agent swarms and 12-hour OS-build runs are illustrative, not yet operational at scale. Error compounding still kills long workflows in production. SWE-bench 94% / SWE-Lancer 26% remains the cautionary split.

Stop arguing about IQ benchmarks. The displacement curve to watch is hours of undisturbed work on the X axis, and cost per autonomous task-hour on the Y axis. That is how an employer will price an agent against a person.

8 h+

Target threshold: one undisturbed knowledge-worker day

~3 mo

METR doubling from 2024 onward (TH 1.1, Jan 2026)

16.1%

RLI automation rate, Fable 5, Jul 2026 (2.5% at launch)

~$8 / hr

Opus 4.7 autonomous coding (moderate load, ~1M tok/hr)

~$16 / hr

Fable 5 same workflow (2× price, ~2× RLI capability)

$130–245 / hr

Loaded US senior dev (EU: €100–150/hr)

Today, a frontier-Opus autonomous coding hour costs roughly an order of magnitude less than the human hour it would replace, before review overhead. After review overhead and retries, the net gap is narrower, but still wide enough to matter.
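The ratio behind that sentence can be sketched from the figures on the cards above. The token volume (~1M tokens per hour) and the hourly figures are taken from those cards. The blended price per million tokens is implied by them, not quoted. The review fraction and retry factor are hypothetical placeholders, so replace them with your own telemetry.

```python
def agent_cost_per_hour(tokens_per_hour: float, usd_per_mtok: float) -> float:
    """Raw API cost of one autonomous hour."""
    return tokens_per_hour / 1_000_000 * usd_per_mtok


def effective_agent_cost(agent_hourly: float, human_hourly: float,
                         review_fraction: float, retry_factor: float) -> float:
    """Agent cost per useful hour after retries, plus the human review it needs.

    review_fraction: human hours spent reviewing per agent hour (hypothetical).
    retry_factor: agent hours actually run per useful hour (hypothetical).
    """
    return agent_hourly * retry_factor + human_hourly * review_fraction


raw = agent_cost_per_hour(1_000_000, 8.0)  # ~$8/hr, as on the card
for human in (130.0, 245.0):               # loaded US senior dev range
    net = effective_agent_cost(raw, human, review_fraction=0.15, retry_factor=1.5)
    print(f"human ${human:.0f}/hr: raw ratio {human / raw:.1f}x, "
          f"net ratio {human / net:.1f}x")
```

With these placeholder overheads the raw gap of roughly 16× to 31× shrinks to single digits. The direction matches the claim above, but the exact net ratio depends entirely on review and retry numbers you have to measure.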

Two units

The conversation about AI capability is changing because the people who buy capability are not benchmark researchers. Employers do not care whether a model added two points on MMLU. They care about two numbers. First: how many hours an agent can work on something autonomously before a human needs to step in. Second: what an hour of that autonomous work costs, in API tokens or compute, compared to the loaded hourly rate of the person who would otherwise do it. Those two numbers, multiplied, are the displacement math. The first is improving observably on a roughly three-to-four-month doubling cadence. The second is collapsing on the curve Section 2 covers. The product of the two is what will decide which work moves and which doesn’t.

Where the autonomy is today

The first unit is no longer hypothetical. The observable artifacts as of mid-2026: Claude Code routinely runs multi-hour autonomous coding sessions; Anthropic’s Computer Use lets an agent drive a desktop directly; Cursor (with Composer 2.5) and Windsurf (bundling Devin Cloud) sell agentic coding by the hour, not by the demo. Google demonstrated Antigravity 2.0 by having 93 sub-agents generate 2.6 billion tokens to build the core framework of an operating system in roughly 12 hours: theatre, yes, but the artifact existed. Gemini Spark, announced at I/O on May 19, is a personal agent that runs cloud-side 24/7 even when the user’s device is off. Anthropic’s “Dreaming” feature gives models memory consolidation across long-running work. None of these tools clear the 8-hour undisturbed-work threshold reliably yet, but several can sustain hours of useful autonomous work in narrow domains. That was not true a year ago.

The cost comparison

The displacement framing only works if the second unit lines up with the first. Take a representative knowledge-worker task that takes a senior practitioner about eight hours: a mid-complexity coding ticket with tests, or a structured analysis with document synthesis. Today’s math, using May 2026 prices and observed Claude Code token telemetry:

Claude Opus 4.7 at $5 input / $25 output per million tokens. A moderately loaded autonomous coding agent burns on the order of 1M tokens per hour (typically ~700K input, ~300K output, with most input cached). At cached-input pricing, that lands at roughly $8 per agent-hour. An 8-hour autonomous run lands near $65–$80. A heavy multi-tool autonomous workload pushing 3–5M tokens per hour pushes per-hour cost into the $25–$45 range.
Claude Sonnet 4.6 at $3 input / $15 output per million tokens. Same workload at moderate load: roughly $5 per agent-hour (the ~300K output tokens alone cost $4.50); an 8-hour run lands near $35–$45. Heavy load reaches $15–$25 per hour.
Claude Fable 5 (July 2026 update) at $10 input / $50 output per million tokens, API-only post-recall. Same moderate load: roughly $16 per agent-hour (the ~300K output tokens alone cost $15); an 8-hour run lands near $130–$160. The premium is priced against measured capability; see “What the Fable 5 pricing does to the math” below.
Senior developer comparator: in the US, loaded hourly cost typically lands at $130–$245 per hour (base $110–$175 plus 20–40% loaded for benefits, taxes, overhead). In Western Europe the equivalent runs €100–€150 fully loaded; in CEE roughly half that. Per 8-hour day, that’s $1,000–$2,000 US / €800–€1,200 Western EU.
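The per-hour figures in this list are plain token arithmetic, and it is worth being able to redo them with your own telemetry. A minimal sketch in Python, using the list prices and the ~700K input / ~300K output split quoted above. Two inputs are assumptions the text does not specify: that 90% of input tokens are cache reads, and that a cache read is billed at 10% of the list input price. The function and parameter names are illustrative.

```python
# List prices in USD per million tokens, as quoted in the list above.
PRICES = {
    "Sonnet 4.6": (3.0, 15.0),
    "Opus 4.7": (5.0, 25.0),
    "Fable 5": (10.0, 50.0),
}

def agent_hour_cost(input_price, output_price,
                    input_mtok=0.7, output_mtok=0.3,
                    cached_share=0.9, cache_read_discount=0.1):
    """Estimated USD cost of one autonomous agent-hour.

    cached_share and cache_read_discount are assumptions, not figures
    from the text: 90% of input tokens are served from cache, and a
    cache read is billed at 10% of the list input price.
    """
    cached = input_mtok * cached_share * input_price * cache_read_discount
    uncached = input_mtok * (1 - cached_share) * input_price
    output = output_mtok * output_price
    return cached + uncached + output

for model, (inp, out) in PRICES.items():
    hourly = agent_hour_cost(inp, out)
    print(f"{model}: ~${hourly:.2f}/hr, ~${8 * hourly:.0f} per 8-hour run")
```

Under these assumptions output tokens dominate the bill, which is why the moderate-load figures track the output price so closely. Replace the token split and cache share with your own measured values before relying on the result.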

The raw ratio, before any overhead, is striking: an autonomous Opus 4.7 hour costs roughly 15–30× less than the senior US developer hour it would replace; an autonomous Sonnet 4.6 hour costs roughly 30–50× less; even a Fable 5 hour, the most expensive agent-hour on the market, comes in roughly 8–15× below the human comparator. Most observers stop here, get excited, and reach the wrong conclusion. The honest number adds review and retry overhead: every autonomous hour today realistically needs roughly 20–40 minutes of human supervision, evaluation runs, and retry cycles before the work ships. That overhead compresses the effective gap into something more like 3–10× cheaper for the right workflow, and to break-even or worse for workflows where the model still gets stuck.
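The gross ratios quoted above are the loaded human rate divided by the agent-hour cost. A quick check in Python, using the rounded moderate-load figures ($5, $8, $16 per agent-hour) and the $130–$245 US comparator; the names are illustrative, and review and retry overhead is deliberately left out of this sketch.

```python
HUMAN_RATE_USD = (130.0, 245.0)  # loaded US senior developer, low and high
AGENT_RATE_USD = {"Sonnet 4.6": 5.0, "Opus 4.7": 8.0, "Fable 5": 16.0}

def gross_ratio(human_rate, agent_rate):
    """How many times cheaper the agent-hour is, before any overhead."""
    return human_rate / agent_rate

for model, agent_rate in AGENT_RATE_USD.items():
    low = gross_ratio(HUMAN_RATE_USD[0], agent_rate)
    high = gross_ratio(HUMAN_RATE_USD[1], agent_rate)
    print(f"{model}: {low:.0f}x to {high:.0f}x cheaper (gross)")
```

The results land close to the rounded ranges in the paragraph above. They are gross figures only; the net gap depends on supervision time you have to measure in your own workflow.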

Section 2 explains why the underlying token gap closes faster than people expect: frontier-equivalent inference costs fell ~99% in two years and the April 2026 open-weight wave (DeepSeek V4-Pro at $0.28/$2.48 per MTok) is another factor of 10 below paid frontier. The crossover for any specific workflow depends mostly on two things: how much human supervision overhead it still needs, and what the loaded-cost comparator actually is in your geography. Pick one workflow your team actually does. Estimate both numbers. Track the ratio quarterly. That ratio, more than any benchmark, will tell you when displacement becomes economic.

The METR data point

One supporting data point is worth keeping in view. METR (Model Evaluation & Threat Research) publishes a measured benchmark called Time Horizon: the duration of human work a model can complete with 50% reliability. Their 2019–2025 dataset showed the frontier doubling roughly every 7 months. Their TH 1.1 update (January 2026) expanded the task suite by a third and doubled the number of 8-hour-plus tasks; measured from 2024 onward, the doubling time comes out near 3 months (89 days). The anchors move fast: Claude 3.7 Sonnet (early 2025) sat around a 50-minute horizon, while METR’s February–March 2026 pilot with the frontier labs reported the strongest agents near or beyond the reliable measurement range of the benchmark itself. If the ~3–4-month doubling holds, frontier models cross the 8-hour threshold by late 2026 or early 2027; the ruler may run out before the calendar does. If the older 7-month cadence reasserts itself, that slips toward 2028. Either way, the trajectory is the input to the displacement curve, not the headline number.
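The crossing dates above are doubling arithmetic: with a horizon of h today and a doubling time of d, reaching a target H takes d × log2(H / h). A small Python sketch shows how sensitive the crossing date is to the cadence. The 2-hour starting horizon is a hypothetical placeholder, not a METR measurement, and the function name is illustrative.

```python
import math

def months_to_reach(current_hours, target_hours, doubling_months):
    """Months until the time horizon reaches target_hours, assuming
    steady exponential growth at the given doubling time."""
    if current_hours >= target_hours:
        return 0.0
    doublings_needed = math.log2(target_hours / current_hours)
    return doublings_needed * doubling_months

# Hypothetical starting point: a 2-hour reliable horizon today.
for doubling_months in (3, 4, 7):
    months = months_to_reach(2.0, 8.0, doubling_months)
    print(f"doubling every {doubling_months} months: 8 h in ~{months:.0f} months")
```

Because the relationship is linear in the doubling time, moving from a 3-month to a 7-month cadence more than doubles the wait for the same threshold.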

The Remote Labor Index: displacement gets a leaderboard

As of July 2026, this current has the instrument it was missing. The Remote Labor Index (Center for AI Safety and Scale AI) is built from real commissioned freelance projects: more than 6,000 hours of professional work worth over $140,000, across eight domains from CAD and architecture to data analysis and animation. Its metric is exactly this chapter’s question in benchmark form: the automation rate, the share of projects where the AI agent’s deliverable is judged as good as the paid professional’s. Unlike SWE-bench or MMLU, the units are dollars and deliverables, not points.

The curve so far: 2.5% at launch in late 2025; 4.17% by the spring (Opus 4.6 running in an agentic scaffold); then on July 1, 2026, CAIS published the first Mythos-class result: Fable 5 at 16.1% (measured on 218 of 240 projects before the access restrictions landed), roughly double Opus 4.8’s 8.3%, with GPT-5.5 at 6.3%. The frontier more than quadrupled in under eight months. Two readings of that number are simultaneously true, and both matter. Read pessimistically: 84% of real, paid remote-work projects still can’t be automated end-to-end, and even headline deliverables often wouldn’t ship as finished client work; that is the lab-to-production gap of Current 5, measured in invoices. Read as a trajectory: this is the METR doubling cadence showing up in a dollar-denominated instrument, and it converts the “hours” unit of this chapter into something an employer can put in a spreadsheet.
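To see why the trajectory reading lines up with the METR cadence, one can back out the doubling time implied by the two endpoints. A rough Python sketch, assuming steady exponential growth between 2.5% and 16.1% and taking eight months as the elapsed time (the text says "under eight", so the result is an upper bound). Two data points from different models cannot establish a trend, and a rate capped at 100% cannot grow exponentially for long, so treat this as an illustration only.

```python
import math

def implied_doubling_months(start_rate, end_rate, elapsed_months):
    """Doubling time implied by two measurements, assuming exponential growth."""
    return elapsed_months * math.log(2) / math.log(end_rate / start_rate)

doubling = implied_doubling_months(2.5, 16.1, 8.0)
print(f"growth factor: {16.1 / 2.5:.1f}x, implied doubling: ~{doubling:.1f} months")
```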

Field note: my own hours and dollars

One data point from my own desk, dated early July 2026, for whatever it is worth. In the GPT-4 era I used models as assistants: a snippet here, a draft there, with me doing all the assembling. Through the Opus generations the ratio shifted; the model handled real stretches of work while I still contributed a large share of the judgment, review, and glue. With Fable 5 I felt the “hours” unit jump a tier in a single release. The week this edition shipped, we rebuilt most of the booklet together, and I could honestly have watched movies all day while it happened: the model ran the research, the drafting, the fact-checking loops, and most of the editing itself. My monthly spend tells the same story from the dollars side: roughly $20 a month in the GPT-4 era, $200 a month through the Opus generation, and now heading well past $500. I have never paid more per month, and an hour of finished work has never cost me less. That crossover, felt at one desk, is this whole current in miniature.

What the Fable 5 pricing does to the math

The June frontier release also moved the second unit, upward, which is why this section’s numbers need re-basing. Fable 5 launched at $10 input / $50 output per million tokens, double Opus 4.7 on both meters, and its post-recall availability is API-only with a strong moderation layer. The same moderate autonomous load (~1M tokens per hour) lands at roughly $16 per agent-hour; the ~300K output tokens alone cost $15. But price both sides of the trade: at roughly 2× the price it posts roughly 2× the RLI automation rate of Opus 4.8, so capability per dollar at the very top held flat-to-better even as the absolute price rose. This is Current 2’s two-lane market seen from the buyer’s chair: the frontier lane charges a premium that, for now, buys measurably more finished work per dollar on the hardest tasks, while one tier down, the commodity lane keeps collapsing toward hardware cost.

What this implies for capex

If “hours of autonomous work” is the right capability axis, the capex picture from Section 1 changes shape. The training portion of hyperscaler spend (the largest pre-training runs) becomes harder to justify on its own; smaller models with strong post-training, RL, and tool use can match the capabilities of larger ones for many workflows. The inference portion, reportedly projected at roughly 75% of AI compute demand by 2030, becomes more justified, because every long-running agent is a multi-token, multi-call, often multi-hour inference workload. Reasoning models with extended thinking can use 10–100× more tokens per task than chat-style models. The capex stays justified; what shifts is its allocation between training and serving.

16.1%

RLI automation rate (Fable 5); frontier >4× in 8 months

$5–16 / hr

Agent-hour, Sonnet 4.6 → Fable 5 (moderate, ~1M tok/hr)

$130–245

US senior dev loaded hourly comparator

3–10×

Net displacement gap after review & retry overhead

94% ↔ 26%

SWE-bench Verified vs SWE-Lancer (production gap)

Trigger signals: what to watch for

8h+ undisturbed autonomy on an RLI- or METR-style task suite at net agent-cost below 25% of the loaded human comparator after review and retry overhead (today the gross gap is large; the net gap closes when supervision overhead drops)
An RLI-style measured automation rate crosses 25% on the published task pool, the next arming threshold after July’s 16.1%
A named enterprise publishes a multi-day agent workflow running in production with cost or ROI figures: in an earnings call, a numbered case study, or an audited report
A hyperscaler discloses the inference share of AI capex, or states that inference spend now exceeds training spend, in an earnings call
A vendor publishes its own agent cost-per-hour benchmark alongside accuracy benchmarks (the RLI now supplies the third-party version; this trigger is the vendor’s own disclosure)
Counter-trigger: net agent cost-per-hour (after supervision overhead) stays above human comparator on representative tasks for two consecutive frontier releases

Trigger log · July 2026 edition

Fired Displacement got a leaderboard, and the frontier quadrupled on it. July 1, 2026 · CAIS published Fable 5 at a 16.1% RLI automation rate (218/240 projects, measured before the access restrictions landed): roughly 2× Opus 4.8, more than 4× the published leader of eight months earlier. The first dollar-denominated, third-party instrument for this current’s two units is live and moving fast. Next arming threshold: 25%.
Confirmed The doubling cadence held, and may be faster. Jan–Mar 2026 · METR’s TH 1.1 puts the 2024-onward doubling near 3 months, and their spring pilot reported frontier agents near or beyond the benchmark’s reliable measurement range. The first unit (hours) is moving at the fast end of this chapter’s assumptions.
Not yet Cost-per-hour benchmarks. No major vendor publishes agent cost-per-hour alongside accuracy yet. The RLI supplies the third-party version, but you still have to compute your own, which is exactly what the Team Lead card below asks you to do.

Implications by role

Developer

Reliability and observability matter more than prompt cleverness. Retry logic, intermediate checkpoints, tool-call audit trails. The hard part is the long tail, not the happy path.

Team Lead

Pick one representative team workflow. Estimate (a) autonomous hours, (b) agent $/hr (today ~$16 Fable 5 / ~$8 Opus / ~$5 Sonnet, plus your supervision overhead), (c) loaded human hourly rate. Track that ratio quarterly; it’s your displacement curve.

CTO / VP Eng

Plan for inference-heavy workloads. Budget and architecture should assume agentic workflows with 10–100× more tokens per task, not chat.

Procurement

Vendor evaluation should include long-horizon task benchmarks and cost-per-hour, not just single-prompt scores. Ask for production reliability data, not just leaderboard positions.
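The Developer card above names three mechanics: retry logic, intermediate checkpoints, and tool-call audit trails. A minimal Python sketch of how they fit together, using only the standard library. The step functions stand in for real tool calls, and every name here (run_with_checkpoints, flaky_fetch, the checkpoint file layout) is hypothetical, not any framework's API.

```python
import json
import os
import tempfile

def run_with_checkpoints(steps, checkpoint_path, audit_log, max_retries=2):
    """Run named steps in order, resuming after the last completed one.

    steps: list of (name, callable(state) -> state). The state must be
    JSON-serializable so it can be written to the checkpoint file.
    """
    done, state = [], {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)
        done, state = saved["done"], saved["state"]
    for name, step in steps:
        if name in done:
            continue  # already completed in an earlier run
        for attempt in range(1, max_retries + 2):
            try:
                state = step(state)
                audit_log.append({"step": name, "attempt": attempt, "ok": True})
                break
            except Exception as exc:
                audit_log.append({"step": name, "attempt": attempt,
                                  "ok": False, "error": str(exc)})
                if attempt == max_retries + 1:
                    raise  # retries exhausted; the checkpoint keeps earlier work
        done.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump({"done": done, "state": state}, f)
    return state

# Demo: the first tool call times out once, then succeeds on retry.
calls = {"n": 0}

def flaky_fetch(state):
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("tool timed out")
    return {**state, "rows": 3}

def summarize(state):
    return {**state, "summary": f"{state['rows']} rows"}

audit = []
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
workflow = [("fetch", flaky_fetch), ("summarize", summarize)]
final = run_with_checkpoints(workflow, path, audit)
print(final)
print(len(audit), "audit entries")
```

Running the same workflow again against the same checkpoint file skips both steps, which is the property that matters when a multi-hour run dies near the end.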

Data: METR Time Horizons (TH 1.0 Mar 2025, TH 1.1 Jan 2026) • Remote Labor Index (CAIS / Scale AI, remotelabor.ai, Jul 2026) • SWE-bench Verified / SWE-Lancer leaderboards • Antigravity 2.0 demo writeups • Anthropic API pricing (May–Jul 2026) • Claude Code usage telemetry • Index.dev / MarsDevs developer hourly-rate surveys 2026

Current 7

The Physical Substrate
“Can the atoms keep up with the bits?”

The thesis

Every other current in this booklet quietly assumes the atoms show up. The ~$700B of 2026 capex is a claim on physical things: advanced wafers from one island, high-bandwidth memory from three vendors, gigawatts from a grid whose connection queue is measured in years. The binding constraint is shifting from capital to physics and permits: money stopped being the scarce input before the wafers and megawatts did.

The risk

Concentration is the discontinuity. TSMC fabricates roughly 90% of the world’s most advanced logic (effectively all the accelerators frontier models train on) behind one strait, one packaging technology, one memory oligopoly. A Taiwan disruption is the single event that would fire triggers in all seven other currents at once, and this current moves in step-functions, not curves. The counter-risk: doomers have under-priced adaptation before, and on-site generation, diversified fabs, and Current 2’s efficiency gains all push back.

The decoder dates money into capability. This current asks whether the physical middle of that pipeline (fabs, packaging, memory, power, permits) can actually execute on the schedule the money assumes. The slowest clock wins.

~90%

Of most-advanced logic chips fabbed by TSMC

~2,300 GW

Generation + storage waiting in US interconnection queues (LBNL)

3× capped

Consecutive PJM capacity auctions clearing at the price cap ($333/MW-day, Dec 2025)

300+

Data-center bills filed in 30+ states, first six weeks of 2026

61.8→134 GW

US data-center grid demand, 2025→2030 (451 Research)

Five anchors: one concentration, one queue, one price signal, one political wave, one demand curve.

The bit-atom mismatch

Every timescale in the capex decoder is really a physical timescale wearing a financial costume. Building a leading-edge fab takes three to five years. A gigawatt-scale grid interconnection can take most of a decade in congested regions. A data center takes 12–24 months if the power is already there. A frontier training run takes months. The decoder’s clean arithmetic (money in year N, capability in year N+2 to N+4) holds only as long as the slowest of those clocks doesn’t slip. This current watches the clocks.

Power: the queue is the moat

The Lawrence Berkeley National Laboratory’s queue tracker counts roughly 2,300 gigawatts of generation and storage waiting for US grid interconnection, nearly twice the country’s entire installed capacity, with typical waits measured in years. Meanwhile the demand side compounds: the LBNL/DOE energy report has data centers at 4.4% of US electricity in 2023, projected to 6.7–12% by 2028, and 451 Research has data-center grid demand more than doubling from 61.8 GW in 2025 to 134 GW by 2030. The market has already priced the collision. PJM, the grid operator for the world’s densest data-center corridor, has now cleared three consecutive capacity auctions at the regulatory price cap; the December 2025 auction settled at $333.44/MW-day and would have cleared near $530 without the cap, with the market monitor identifying data centers as the primary driver of roughly 40% of that auction’s $16.4 billion cost. This is why hyperscalers are buying power plants rather than power: the Three Mile Island Unit 1 restart (now the Crane Clean Energy Center) exists because Microsoft signed a 20-year deal for all 835 MW of it, and it is tracking toward a second-half-2027 restart, a year ahead of the original plan. The first US nuclear reactor brought back from retirement specifically to feed AI is a fact the 2015 version of this industry would have found unbelievable.

Chips: one island, one packager, three memory vendors

The silicon side concentrates even harder. TSMC fabricates roughly 90% of the world’s most advanced logic chips, and its advanced nodes (the ones AI accelerators use) now account for about 74% of its wafer revenue, with price increases landing across all of them. The bottleneck within the bottleneck is advanced packaging: CoWoS capacity is fully booked with lead times of 52–78 weeks, demand approaching a million wafers in 2026, and Nvidia alone holding an estimated 60% of it. High-bandwidth memory is a three-vendor oligopoly in which SK Hynix holds roughly 62%, and HBM3E is effectively sold out for all of 2026. Read those three sentences against the decoder: when the inputs to capability are pre-sold 12–18 months out, capex stops being a dial you can turn and becomes a queue position you defend.

The politics of power arrived

Through 2025 this current was an engineering story. In 2026 it grew a political layer, fast. More than 300 data-center bills were filed in over 30 states in the first six weeks of the year (up from about 200 bills in all of 2025), with at least a dozen states proposing outright moratoria. Maine’s legislature actually passed the first statewide ban (LD 307, blocking data centers over 20 MW); Governor Mills vetoed it in April and the override failed by seven votes. New York’s legislature passed a one-year moratorium of its own, awaiting signature as this edition went to press. The pattern to watch is not any single bill but the direction: retail electricity bills are becoming the political transmission mechanism, since IEEFA attributes 63% of the price increase in PJM’s 2025/26 auction, some $9.3 billion recoverable from ratepayers, to data centers, and voters pay retail bills. In parallel, the federal layer began treating compute itself as a controlled substance: rules effective January 15, 2026 codified case-by-case export review for advanced AI chips and imposed a 25% tariff on advanced AI chips not destined for the American supply chain. The companion booklet’s “bullion is compute” chapter argues this is exactly where mercantilist control naturally lands, because concentrated compute is the one input that cannot leak.

Why this is a current, not a footnote to Current 1

Current 1 asks whether the money keeps being right. This current asks whether the money can be spent. Those are different failure modes with different signatures: Current 1 fails through disappointing models, Current 3 through disappearing funding, and Current 7 through slipping dates, regional scarcity, and price premiums that have nothing to do with capability. It was promoted from background assumption to full current in this edition because 2026 is the year its trigger surface became observable: capacity auctions at caps, moratorium bills in a dozen statehouses, packaging lead times crossing a year and a half. The atoms got political.

52–78 wk

CoWoS advanced-packaging lead times (2026)

Sold out

HBM3E for 2026; SK Hynix holds ~62% of HBM

835 MW

TMI Unit 1 restart, 100% to Microsoft (H2 2027)

$9.3B

Data centers’ share of PJM 2025/26 price increase, passed to ratepayers (IEEFA)

25%

US tariff on advanced AI chips leaving the US supply chain (Jan 2026)

Trigger signals: what to watch for

A hyperscaler names power, interconnection, or permitting as the reason for a capex cut or a named-project delay of ≥2 quarters in an earnings call
A US state enacts a statewide data-center moratorium or >20 MW ban: Maine’s passed-then-vetoed LD 307 and New York’s passed-but-unsigned bill show how close this trigger sits
Nvidia, AMD, or a hyperscaler names HBM or advanced-packaging allocation as a shipment-gating factor in an earnings call, or GPU lead times publicly exceed 52 weeks
Compute concentration becomes a regulated quantity: a rule, threshold, or reporting regime targeting cluster size or total deployed compute as such (the companion booklet’s Bet 3, watched from here)
Counter-trigger: ≥5 GW of new US data-center capacity energized in a single year via behind-the-meter or on-site generation, meaning the queue stops being the constraint

Trigger log · July 2026 edition

Fired The politics of power arrived. Feb–Jun 2026 · 300+ state bills in six weeks; a dozen states with moratorium proposals; Maine’s first-in-nation ban passed and was vetoed, and the override failed by only seven votes; New York’s one-year moratorium passed and awaits signature; PJM cleared its third consecutive auction at the price cap with data centers named the primary driver. The physical substrate now has a voting constituency. The enactment trigger above is one signature away.
Not yet Compute as a regulated quantity. The January 15 export rules and tariff regulate where advanced chips may go, and control keeps climbing the stack (chips → models, per June 12), but no rule yet targets cluster size or total deployed compute as such. When it lands, log it here and in Current 4.

Implications by role

Developer

Treat capacity as a first-class failure mode, like network partitions: regional inference fallbacks, graceful degradation under rate limits, multi-region routing. Scarcity shows up as 429s before it shows up in headlines.

Team Lead

GPU and capacity reservations now carry procurement lead times measured in quarters. Put them on the project plan like any long-lead dependency: after the requirements, before the sprint.

CTO / VP Eng

Vendor due diligence now includes where the inference physically runs and on what power contracts. Run one Taiwan-event tabletop exercise and record which of your workloads dies first; that list is your real exposure.

Procurement

Make allocation-shortfall and energy-cost pass-through clauses explicit in AI contracts. A vendor who won’t discuss its power and capacity exposure is making it your risk without your consent.
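The Developer card above treats capacity as a failure mode that surfaces as HTTP 429 responses. A minimal Python sketch of regional fallback with backoff, using only the standard library. The client callable, the RateLimited exception, and the region names are all hypothetical stand-ins; a real provider SDK will have its own error types and routing options.

```python
import random
import time

class RateLimited(Exception):
    """Raised by a client wrapper when the provider answers HTTP 429."""

def call_with_fallback(call_region, regions, prompt,
                       retries_per_region=2, base_delay=0.5, sleep=time.sleep):
    """Try each region in order; retry on 429, then move to the next region.

    call_region is a hypothetical callable (region, prompt) -> str that
    raises RateLimited on HTTP 429. Returns (region_used, response).
    """
    for region in regions:
        for attempt in range(retries_per_region):
            try:
                return region, call_region(region, prompt)
            except RateLimited:
                if attempt < retries_per_region - 1:
                    # Exponential backoff with jitter before retrying this region.
                    sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    raise RuntimeError("all regions rate-limited; degrade gracefully here")

def fake_client(region, prompt):
    # Stand-in for a real inference call: the first region is saturated.
    if region == "us-east":
        raise RateLimited(region)
    return f"[{region}] ok: {prompt}"

region, reply = call_with_fallback(fake_client, ["us-east", "eu-west"], "ping",
                                   sleep=lambda seconds: None)
print(region, reply)
```

The final RuntimeError is the place to attach graceful degradation: a smaller model, a cached answer, or a queued job, depending on what the workload tolerates.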

Data: TSMC quarterly reports • LBNL “Queued Up” (Dec 2025) & US Data Center Energy Usage Report • PJM capacity auction results / IEEFA / PJM market monitor • S&P Global 451 Research • Silicon Analysts foundry allocation Q1 2026 • Constellation–Microsoft PPA • MultiState / Good Jobs First legislation trackers • BIS rule & proclamation eff. Jan 15, 2026 • Companion: The Mercantilism of Generative AI (compute as bullion)

Current 8

The Political Economy of Displacement
“What happens when Current 6’s math starts working?”

The thesis

Current 6 tracks whether the displacement math works. This current tracks whether it will be allowed to keep working. Displacement politics can move faster than displacement economics: one legible number (the RLI just made automation rates a headline statistic), one bad summer of layoffs, one election, and rules arrive in a single legislative session. The vendors themselves believe this: OpenAI is proposing robot taxes and public wealth funds before any government has asked.

The risk

The backlash may stay theatrical. Official channels barely register AI displacement: eleven months into New York’s first-in-nation disclosure rule, zero of 162 mass-layoff filings checked the AI box. Polling anxiety hasn’t become voting behavior, and taxing “AI labor” is definitionally hard: nobody can yet specify the taxable unit. This current can simmer for years, and then move all at once.

The gap between what’s happening and what’s officially recorded is this current’s defining measurement, and its fuse.

−16%

Employment, 22–25-year-olds in most AI-exposed occupations (ADP / “Canaries” study)

0 / 162

NY mass-layoff filings attributing cuts to AI, 11 months into the disclosure rule

87,714

AI-attributed US job cuts, Jan–May 2026 (Challenger), vs 54,836 in all of 2025

75%

Americans who expect AI to reduce jobs over the next decade (Gallup/Bentley, 2023)

$500M+

Raised toward the $1B RAISE US retraining fund (Jun 2026)

Five anchors: a labor-market signal, an attribution gap, a press tally, a sentiment number, and the industry pre-paying its political bill.

Strip the statistics away and this current asks a plain question: when AI starts doing real work, what do the people whose work it was, and the politicians who represent them, do about it? Current 6 watches whether the automation math works. This current watches the human response to it: workers organizing, companies hiding the ball, vendors quietly pre-paying their political bills, legislators hunting for something to tax. Four scenes carry the story.

The canaries stopped singing

The cleanest evidence that displacement is real comes from the Stanford Digital Economy Lab’s “Canaries in the Coal Mine” study, run on ADP payroll data covering millions of workers. Its core finding is simple: AI absorbs tasks before it absorbs jobs, and entry-level tasks first. Since generative AI arrived, employment of the youngest workers (22–25) in the most AI-exposed occupations has fallen roughly 16%, while employment of older workers in the same occupations at the same firms grew. Read that again: same firm, same job title, opposite fates, depending on whether your daily work was the entry-level slice. The follow-up dashboard shows the decline still accelerating through spring 2026, and the Stanford AI Index adds the sharpest sectoral cut: employment of young software developers down nearly 20% since 2024. This is Current 6’s economics arriving in payroll data, youngest cohort first, exactly where the hours-and-dollars math said it would bite.

Attribution laundering

Now put that evidence next to the official record, because the two tell opposite stories. New York became the first state to require employers to say whether AI caused a mass layoff. Eleven months in, zero of 162 filings have said yes. Over the same stretch, the running press tally of publicly AI-attributed job cuts (Challenger, Gray & Christmas) reached 87,714 in the first five months of 2026, more than in all of 2025. Both numbers can be true at once because attribution is a choice: a layoff filed as “restructuring” is legally cleaner and reputationally cheaper than one filed as “automation.” Amazon cutting roughly 30,000 corporate roles while spending $125 billion on AI capex, with reported internal plans to avoid some 600,000 future hires through robotics and AI, is the same move at its largest scale. The planning consequence is unusual: official statistics will systematically understate this current, so its triggers have to watch other venues (filing gaps, union demands, private funds) rather than the headline numbers.

Two things I keep hearing in practice

In my consulting and training work, two beliefs about this current come up constantly, and I want to flag both. The first is that today’s job market, especially in IT, is simply a hangover from the over-hiring of the covid years: companies gorged on headcount in 2021–22, and what we are watching now is the diet. Honestly, I don’t know how much of the current picture that explains. It is a real effect. It is also a very comfortable story, because it requires nobody to update anything. What I can say is that the payroll evidence above, with the youngest workers in the most AI-exposed occupations diverging from older colleagues inside the same firms, does not look like a covid correction to me. A covid correction doesn’t care what your occupation’s AI exposure is.

The second belief is a mindset flaw about speed. People calibrate on previous technology waves: it took marketing departments something like fifteen years to be forced to learn social media, so surely there is time to adjust to this too. I think that clock is wrong. This wave is compressed from both ends. On the supply side, a handful of hyperscalers push capability out at the pace of Current 1’s capex machine, with a new tier landing every couple of years and prices collapsing between tiers. On the demand side sit enterprises that have waited decades for exactly this: the ability to automate creative office work, the white-collar tasks that every previous automation wave, which mostly reached blue-collar and routine work, could never touch. When an eager supply side meets a demand side this motivated, the fifteen-year adjustment window collapses to a few years. If you are pacing your career or your team’s plans on how long digital transformation took, you are using the wrong clock.

The tax question arrives before the tax

The most telling 2026 development is who moved first. On April 6, OpenAI published economic-policy proposals that include a potential robot tax, a “Public Wealth Fund” giving Americans an automatic stake in AI companies and infrastructure, and a subsidized four-day workweek. Sit with that for a second: the company building the technology is designing its own redistribution mechanism, unprompted. In June, the RAISE US workforce fund launched with more than $500 million raised toward a $1 billion goal, anchored by Amazon, Anthropic, Microsoft, and the OpenAI Foundation. Meanwhile the definitional problem keeps actual legislation stuck: as Bloomberg Law’s survey of tax authorities put it, governments want to tax AI but cannot yet define the unit. Tokens? Agent-hours? Compute? Displaced heads? When an industry starts pre-paying a political bill nobody has formally presented, it is telling you what it expects this current to deliver.

Labor’s playbook already exists

The response side is not starting from zero. SAG-AFTRA’s 2023 AI provisions, the Las Vegas Culinary Union’s technology-severance terms ($2,000 per year worked for tech-displaced workers), and the longshoremen’s January 2025 contract restricting port automation all predate the agent era. In 2026 the playbook entered knowledge work: a one-day strike at ProPublica in April with AI use as a central issue, a New York Times union letter calling the paper’s AI policies inadequate the day before, an AP union unfair-labor-practice complaint over AI the day before that, and a December 2025 arbitration win for the union at Politico, which had launched AI products without consulting it. AI clauses are becoming a normal bargaining demand. None of this is yet a 10,000-worker action, but the templates are written, tested, and circulating.

Why this is a current, not a chapter of Current 6

Current 6 measures the economics; this current tracks the response function, and response functions are discontinuous where economics are smooth. The RLI is the hinge between them: the same instrument that tells employers the math is starting to work tells legislators it is working. Note the bookkeeping implication, because it matters for the synthesis: when one RLI publication lands in both chapters’ trigger logs, that is one event wearing two jackets. It counts once under the one-event rule, not as a fired pair.

3.8%/yr

Decline rate, young workers in exposed occupations (accelerating from 2.8%)

~20%

Drop in software-dev employment among 22–25-year-olds since 2024 (AI Index 2026)

Apr 6, 2026

OpenAI proposes robot tax, public wealth fund, 4-day week

$2,000/yr

Tech-displacement severance per year worked (Culinary Union template)

28,300

Jobs in NY WARN filings, none attributed to AI

Trigger signals: what to watch for

A G7 economy enacts (not proposes) a levy, tax, or mandatory reporting regime whose unit is AI usage: tokens, agent-hours, compute, or displacement headcount
The attribution dam breaks: at least one company formally attributes a mass layoff to AI in an official government filing, or a jurisdiction makes AI attribution mandatory with penalties
A strike or collective action of ≥10,000 workers where AI use or displacement is the stated primary issue, with AI terms appearing in the settlement
A professional licensing body (bar association, medical board, accounting institute) in a major economy adopts binding rules restricting AI performance of licensed work: adopted rules, not draft guidance
Counter-trigger: two consecutive quarters of official labor statistics (BLS / Eurostat) showing AI-exposed occupations tracking the general labor market with no divergence, meaning the canaries recover

Trigger log · July 2026 edition

Fired The vendors are pre-negotiating the backlash. Apr–Jun 2026 · OpenAI proposed robot taxes, a public wealth fund, and a subsidized four-day workweek (April 6); the RAISE US retraining fund launched with $500M+ from Amazon, Anthropic, Microsoft, and the OpenAI Foundation (June 25). The industry is designing redistribution before being forced to; that is the clearest evidence available of what insiders expect this current to deliver.
Not yet Displacement is happening; attribution is not. Eleven months of New York’s disclosure rule: 0 of 162 filings checked the AI box, while Challenger’s AI-attributed tally hit 87,714 by May. The gap between the press number and the filings number is the attribution-laundering measure. Watch the gap, not either number alone; the trigger fires when the first filing closes it.

Implications by role

Developer

The exposed tasks are entry-level tasks. Move up the supervision and orchestration stack, and keep evidence of your judgment work, not just your output. The canaries data says seniority in decisions, not years, is the moat.

Team Lead

Document augment-vs-replace decisions as you make them. They may become reportable, or taxable, retroactively, and the paper trail protects you in both directions.

CTO / VP Eng

Instrument agent usage now (hours, tokens, workflows, headcount deltas) so a future reporting or tax regime is a query, not a re-architecture. Treat “where agents run” as a jurisdictional variable, like data residency.

Procurement / People

Union and works-council consultation duties (especially in the EU) increasingly attach to AI deployment. Check the notification obligations before the rollout announcement, not after; Politico lost that arbitration.

Data: Brynjolfsson, Chandar & Chen “Canaries in the Coal Mine” + Canaries Dashboard (Stanford Digital Economy Lab / ADP) • Stanford HAI AI Index 2026 • Challenger, Gray & Christmas • NY DOL WARN filings • Gallup/Bentley 2023 • Bloomberg Law • OpenAI economic blueprint (Apr 2026) • RAISE US fund • Poynter / Partnership on AI union-agreement tracking • Remote Labor Index (CAIS)

Synthesis

How the Currents Interact

The eight currents are not independent, and reading them one at a time is the most natural mistake to make with this booklet. People argue about scaling, or the bubble, or sovereignty, one lens at a time. But planning errors rarely live inside a single current; they live in the cross-terms, the combinations whose consequences are not the sum of their parts. This section names the seven combinations most worth holding in your head, each with the trigger pair that signals it and the move it should provoke.

The operating rule

A single trigger fires → reweight. Shift attention and budget toward that current; no structural change yet. A pair fires within the same quarter → restructure. Pairs are where strategy actually changes; that is what the composites below are for.

One event, one count. The refinement this edition adds: a pair only justifies restructuring if its two triggers trace to independent events. The currents share underlying machinery (capex above all), so a single headline can fire triggers in several chapters at once. The June 9 Fable 5 launch alone touched Currents 1, 2, and 6; the June 12 recall touched 1 and 4; one capex-flattening announcement could plausibly fire 1, 2, 3, and 7 in the same morning. That is one signal wearing several jackets, not a pair. The test: would trigger B still have fired if event A hadn’t happened? If the answer is no, reweight; don’t restructure. The table below is the tool for running that test on the most common ambiguous headlines.

The basic wiring first. Efficiency (2) arms Sovereignty (4): cheap frontier-equivalent open-weight is what turns on-shore deployment from slogan into procurement choice. Hours and Dollars (6) leans on From Lab to Production (5): eval and supervision burden scales with task duration. Scaling (1) and Efficiency (2) pull capex in opposite directions but resolve through the training-versus-serving split: train-cluster spend gets harder to defend, serving-cluster spend easier. And Scaling (1) and Correction (3) look like opposites but are not mutually exclusive: the technology can work, products can earn real revenue, and the investment timeline can still miss the revenue timeline. The two currents new to this edition slot in at the ends of the chain. The Physical Substrate (7) sits upstream of everything: Current 1’s clusters, 2’s price floor, and 3’s capex all assume the atoms show up on schedule, and 7 is where that assumption gets tested. The Political Economy of Displacement (8) sits downstream of Hours and Dollars (6): 6 measures whether the displacement math works; 8 tracks whether it is allowed to keep working. That’s the wiring. The composites are what happens when current actually flows through it.

Ambiguous headlines: which current actually fired?

Headline	Could touch	The questions that disambiguate
“Hyperscaler capex flattens”	1 · 2 · 3 · 7	(1) What reason does the earnings call give: demand or financing (→3), efficiency gains reducing hardware need (→2), power or permitting (→7)? (2) Did the model roadmap slip with it (→1) or stay intact (pure 3/7)?
“Frontier price moves sharply”	1 · 2 · 6	(1) Is a new capability tier attached (→1, the two-lane premium) or is it the same tier getting cheaper (→2)? (2) Does $/agent-hour cross an employer’s comparator threshold (→6)?
“Big open-weight release near frontier”	2 · 4	(1) Does it change your $/task at the tier you actually use (→2)? (2) Does its license or origin change your sovereign fallback (→4)? And note the inverse: a leading Chinese lab shipping its flagship closed is a 4-only signal, the mercantilist tell, not an efficiency event.
“Vendor severed / model recalled”	4 · 3 · 1	(1) Political-jurisdictional origin (→4) or balance-sheet origin (→3)? (2) Did the capability leave the market (→1 wobbles) or only one buyer’s access to it (→4 only)?
“Mass layoffs attributed to AI”	6 · 8	(1) Do official filings attribute it to AI with task-level economics (→6, the math working) or only the press release (→8, salience, or attribution laundering)? (2) Is there a legislative or union response within the quarter (→8 firing)?

Run a headline through its row before logging a trigger. One event that touches four currents is still one event; the pair rule needs two rows, not one row counted twice.
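The one-event rule is mechanical enough to write down. What follows is a minimal sketch, not a tool I ship: the `Firing` record, the `event_id` label, and the reading of “same quarter” as “same calendar quarter” are my assumptions for illustration. The judgment call (would trigger B still have fired without event A?) stays with you; the `event_id` you assign is where that judgment gets recorded.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Firing:
    current: int     # which current's trigger fired (1-8)
    event_id: str    # hypothetical label for the root event behind the headline
    fired_on: date


def quarter(d: date) -> tuple[int, int]:
    """Calendar quarter as (year, 1-4); an assumed reading of 'same quarter'."""
    return (d.year, (d.month - 1) // 3 + 1)


def review(firings: list[Firing]) -> str:
    """Return 'hold', 'reweight', or 'restructure' for one review's firings."""
    if not firings:
        return "hold"
    for a, b in combinations(firings, 2):
        independent = a.event_id != b.event_id        # one event, one count
        two_currents = a.current != b.current         # a pair spans two chapters
        same_quarter = quarter(a.fired_on) == quarter(b.fired_on)
        if independent and two_currents and same_quarter:
            return "restructure"
    return "reweight"


# One launch touching three currents is still a single event.
launch = [Firing(c, "fable5-launch", date(2026, 6, 9)) for c in (1, 2, 6)]
print(review(launch))
```

Three firings that share one `event_id` never form a pair, however many chapters they touch; only a second, independent event in the same quarter moves the answer from reweight to restructure.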

Current 1 (Continued Scaling) × Current 3 (Financial Correction)

A. The Successful Bubble

Watch for: Two triggers from two different chapters landing within months of each other. From Continued Scaling: the Mythos / Spud generation deploys fully and the capability step is confirmed, undeniable rather than arguable. From Financial Correction: a listed AI lab corrects more than 30% within six months of its IPO, or hyperscaler capex guidance flattens for the first time since 2022.

The interaction term: These two feel like they cannot both be true: if the technology delivers, surely the investment holds? The dot-com years say otherwise, and that is the single most useful thing the 2000 crash can teach. The internet worked. Traffic grew straight through the crash; Amazon’s revenue grew every single year while its stock lost 94%. What collapsed was the willingness to fund the next round, not the technology itself. Now apply the decoder’s logic: the models of 2026–28 are already paid for by 2024–25 capex, and a correction in 2027 cannot reach back and un-spend that money. So this composite world combines two things that feel contradictory: capability keeps arriving on schedule (the last guaranteed wave) while funding for the wave after it dries up. Two practical consequences follow. First, you still have to adopt: the crash does not make Mythos-class capability disappear, and competitors who adopt during the drought will do it at distressed prices. Second, who you buy from suddenly matters enormously, because the vendor landscape thins out fast; the survivors will be the ones with real revenue and cash on hand, exactly the Amazon test from Current 3.

First move: Sort your vendor list into “survives a two-year funding drought” and “doesn’t” before both triggers fire; revenue, cash position, and burn rate are knowable today. Then time any long-term lock-in to the capability wave that is already funded, not the one that depends on 2027 money showing up.

Current 1 (Continued Scaling) × Current 2 (Efficiency Revolution)

B. The Two-Speed Frontier

Watch for: From Continued Scaling: the staircase steps again at the very top, a new frontier generation with a clear capability jump. From the Efficiency Revolution: open-weight models matching the previous step within weeks rather than months. The June 2026 logs already hint at this shape: Fable 5 launched at a price premium (Current 2’s counter-signal), while one tier below it, the April open-weight wave keeps commoditizing on schedule.

The interaction term: The intuitive expectation is that these two forces cancel out: scaling pushes prices up, efficiency pushes them down, and you land somewhere in the middle. What actually happens is different: the market splits into two lanes moving at different speeds. Picture an escalator and a plateau. On the escalator are the workloads that genuinely benefit from each new capability step: frontier research, complex multi-hour agentic work, tasks at the edge of what is currently possible. These keep paying premium prices, because each new step does something the previous one could not. On the plateau is everything else (summarization, extraction, classification, routine drafting), work that was effectively “solved” a generation or two ago. For plateau work, open-weight catches up and the price collapses toward hardware cost. The planning failure this composite exposes: companies budget as if their whole portfolio were on the escalator (“everything should use the best model”), while in reality the large majority of enterprise workloads sit comfortably on the plateau. That mispricing is invisible in any single quarter and compounds every year the split widens.

First move: Walk your AI workflows one by one and tag each: escalator or plateau? The honest test: would this workflow’s output measurably improve with the next frontier generation? Then re-price the plateau workloads against open-weight alternatives this quarter; that is usually where the immediate savings hide.
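The tagging exercise fits in a spreadsheet, but a few lines of code make the logic explicit. This is a sketch under stated assumptions: the `Workflow` fields, the workflow names, and every cost figure are hypothetical placeholders, not data from this booklet. The only thing taken from the text is the honest test itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    improves_with_next_frontier: bool   # the honest test, answered per workflow
    monthly_cost_frontier: float        # what you pay today (hypothetical units)
    monthly_cost_open_weight: float     # quoted or estimated alternative


def lane(w: Workflow) -> str:
    """Escalator if the next frontier generation would measurably help."""
    return "escalator" if w.improves_with_next_frontier else "plateau"


def plateau_savings(workflows: list[Workflow]) -> float:
    """Monthly savings from re-pricing plateau work; never counts a negative."""
    return sum(
        max(0.0, w.monthly_cost_frontier - w.monthly_cost_open_weight)
        for w in workflows
        if lane(w) == "plateau"
    )


# Illustrative portfolio; all names and numbers are made up.
portfolio = [
    Workflow("contract summarization", False, 9000.0, 1500.0),
    Workflow("multi-hour research agent", True, 20000.0, 4000.0),
    Workflow("ticket classification", False, 3000.0, 3500.0),
]
for w in portfolio:
    print(f"{w.name}: {lane(w)}")
print(f"plateau savings per month: {plateau_savings(portfolio):.0f}")
```

Escalator workloads are deliberately excluded from the savings figure: the point of the split is that they are the ones worth paying the premium for.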

Current 2 (Efficiency Revolution) × Current 4 (Sovereignty)

C. The Sovereign Dividend, and Its Residue

Watch for: From the Efficiency Revolution: open-weight parity at the capability tier your workloads actually need, meaning parity for your tasks rather than benchmark parity in the abstract. From Sovereignty: any new severance event, supply-chain designation, or data-residency rule that touches your jurisdiction or your vendor.

The interaction term: The first half is good news, and the booklet has covered it: efficiency is what makes sovereignty affordable. As long as frontier capability lived only behind one or two US-hosted APIs, “sovereign AI” was a slogan; there was nothing equivalent to host yourself. Once open-weight reaches your required tier at a tenth of the cost, the escape hatch becomes real: sensitive workloads can move on-shore or in-house without giving up meaningful capability. When Current 2 fires, it arms Current 4’s exit. But there is a second half that most sovereignty plans miss, and it follows directly from composite B. Commoditization arrives bottom-up: plateau workloads gain credible open-weight fallbacks first; escalator workloads last, or never, if the frontier stays closed and premium. So as open-weight improves, your sovereignty exposure does not shrink evenly across the portfolio. It drains away from the routine work and pools in exactly the workloads that are hardest to replace: the highest-value, frontier-dependent ones. The irony is sharp: the better open-weight gets, the more your remaining lock-in concentrates where a severance event would hurt the most.

First move: For each escalator workflow, write down the degraded-but-acceptable open-weight fallback: which model, what capability you would lose, and what that loss costs per month. Then actually run it once, before anyone forces the question. A fallback that has never been tested is a hope, not a hedge.

Current 6 (Hours and Dollars) × Current 5 (From Lab to Production)

D. The Autonomy Trap

Watch for: From Hours and Dollars: agent time-horizons keep doubling every three to four months; METR’s data has already confirmed this cadence. From the Lab-to-Production side: your own pilot-to-production conversion rate sits below 20% for two consecutive quarters. Note that the second trigger is internal: you are watching your own organization, not the news.

The interaction term: On the surface this looks like good news arriving faster than you can use it: agents improve, your deployment pipeline lags, so you will simply adopt a little later. The actual dynamic is worse, and it is worth slowing down on. Supervision and evaluation burden scales with task duration. An agent that works for twenty minutes can be checked by reading its output. An agent that works alone for eight hours needs eval harnesses, intermediate checkpoints, audit trails of its tool calls, and people trained to review long chains of decisions. Each doubling of autonomous hours roughly doubles what competent supervision requires. Which means the organization that has not built the eval muscle is falling behind at the doubling rate, even while it feels like standing still. By the time 8-hour agents arrive, the supervision bar will sit far above today’s, and that capacity takes quarters to build: hiring, tooling, process, habits. This is why waiting feels prudent and is actually the trap. “We’ll adopt agents once they’re more capable” quietly translates to “we’ll start building supervision capacity at exactly the moment the requirement peaks.” Current 6’s displacement economics only become real for organizations that did Current 5’s homework in advance.

First move: Staff evaluation and red-team capacity now, and size it for the agent autonomy of two doublings from today (roughly six to eight months out), not for the agents you currently run. If hiring is hard, start with one person whose entire job is eval infrastructure; that is the bottleneck role.
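The sizing arithmetic behind “two doublings from today” is short enough to check by hand, and shorter still in code. A rough sketch: it assumes a clean exponential at the three-to-four-month cadence quoted above and takes the “each doubling roughly doubles supervision” rule literally. The two-hour starting horizon is a hypothetical input, not a measurement.

```python
def months_for_doublings(doublings: int, doubling_months: float) -> float:
    """How far out a given number of doublings sits at a fixed cadence."""
    return doublings * doubling_months


def horizon_after(months: float, current_hours: float, doubling_months: float) -> float:
    """Autonomous task horizon after `months`, assuming steady doubling."""
    return current_hours * 2 ** (months / doubling_months)


def supervision_multiplier(doublings: int) -> int:
    """Supervision requirement relative to today, one doubling per doubling."""
    return 2 ** doublings


# Two doublings at the fast and slow ends of the quoted cadence.
print(months_for_doublings(2, 3.0), months_for_doublings(2, 4.0))
# A hypothetical 2-hour agent today, six months out at the fast cadence.
print(horizon_after(6.0, 2.0, 3.0))
print(supervision_multiplier(2))
```

Read the last line as the staffing target: under these assumptions, supervision capacity sized for today’s agents is short by a factor of four within six to eight months.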

Current 3 (Financial Correction) × Current 4 (Sovereignty)

E. The Cheap Hedge

Watch for: Either of two very different headlines about a vendor you depend on. The financial one, from Financial Correction: the funding dries up, the burn rate wins, the Pets.com outcome. The political one, from Sovereignty: a sovereign decision severs access mid-contract, the February 2026 outcome. The point is less to predict which one than to notice that either is possible, and that both arrive without much warning.

The interaction term: Here is what makes this pair special: from where you sit as a buyer, the two failure modes are operationally identical. In both cases the API stops answering, or the contract becomes unusable, mid-quarter, through no fault of yours. One failure comes from a balance sheet, the other from an executive order, but your incident channel looks exactly the same that morning. And when two distinct risks produce the same operational failure, they can share a single mitigation. That mitigation is well-defined: a portable inference layer (your retrieval, evaluation, and tooling stack built so that the model behind it can be swapped without rebuilding everything above it), plus contracts with explicit exit clauses and data-export guarantees. Notice what this does to the economics of hedging. Most insurance protects against one scenario, so each risk costs its own premium. This one investment pays out under Current 3 (your vendor dies), under Current 4 (your vendor is severed), and partially under composite B as well (your vendor reprices the frontier and you want to move plateau work off it). It is the only place in this booklet where one hedge covers multiple currents at once, which makes it, euro for euro, the cheapest insurance on offer.

First move: Run one planned vendor-swap exercise this quarter: take a real workflow, swap the model behind it, and measure how long it takes until output quality is acceptable again. That number (days, weeks, or “we couldn’t do it”) is your honest, measured exposure to two currents at once. Most teams have never measured it.
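In code, a portable inference layer comes down to one discipline: everything above the model talks to an interface, never to a vendor. The sketch below shows the shape of that seam and of the swap exercise. Everything in it is hypothetical: `ModelBackend`, `VendorA`, and `VendorB` are stand-ins that call no real API, and the acceptance check is deliberately crude.

```python
from typing import Callable, Protocol


class ModelBackend(Protocol):
    """The only surface the rest of the stack may depend on."""
    def complete(self, prompt: str) -> str: ...


class VendorA:
    """Stand-in for the incumbent vendor (no real API behind it)."""
    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"


class VendorB:
    """Stand-in for the fallback; truncation mimics a weaker model."""
    def complete(self, prompt: str) -> str:
        return f"[vendor-b] {prompt[:40]}"


class Pipeline:
    """Retrieval, tooling, and prompts live here, above the swappable seam."""
    def __init__(self, backend: ModelBackend) -> None:
        self.backend = backend

    def summarize(self, text: str) -> str:
        return self.backend.complete(f"Summarize: {text}")


def swap_pass_rate(
    pipeline: Pipeline,
    new_backend: ModelBackend,
    cases: list[str],
    acceptable: Callable[[str], bool],
) -> float:
    """Swap the model, rerun the eval cases, return the share still acceptable."""
    pipeline.backend = new_backend
    passed = sum(1 for case in cases if acceptable(pipeline.summarize(case)))
    return passed / len(cases)


cases = ["invoice overdue", "x" * 50 + " invoice"]
keeps_key_term = lambda out: "invoice" in out   # toy acceptance check
pipeline = Pipeline(VendorA())
print(swap_pass_rate(pipeline, VendorB(), cases, keeps_key_term))
```

The pass rate after the swap is the quality half of the exercise; the calendar time it takes your team to get that number back to acceptable is the exposure figure the first move asks for.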

Current 1 (Continued Scaling) × Current 7 (The Physical Substrate)

F. The Grid Ceiling

Watch for: From Continued Scaling: capex guidance keeps rising by more than 30% and the next generation stays on schedule, so the demand side stays healthy. From the Physical Substrate: a hyperscaler names power, interconnection, or permitting as the reason for a capex cut or a named-project delay, or a state enacts a data-center moratorium.

The interaction term: Scaling’s failure mode was always assumed to be scientific (the staircase flattens) or financial (Current 3’s funding drought). This composite is the third way, and it is stranger than either: the money is there, the science works, and the atoms don’t arrive. Training runs wait behind interconnection queues, and the clusters that were paid for in 2026 energize in 2029 instead of 2028. Notice what that does to the decoder: the whole instrument rests on capex carrying a reliable date stamp, and this composite is the one place where the date stamps themselves start slipping. Capability still arrives, late and unevenly by geography, because power is a local commodity. It also produces two-lane pricing for a different reason than composite B: a scarcity premium rather than a capability premium, and the headline (“frontier prices rise”) looks identical. The disambiguation table exists for exactly this confusion.

First move: Ask each strategic vendor one question at the next review: of your announced capacity, how much is energized, how much is permitted, and how much is announced? Weight their roadmap promises by that ratio, not by the press release.

Current 6 (Hours and Dollars) × Current 8 (Political Economy of Displacement)

G. The Backlash Curve

Watch for: From Hours and Dollars: a measured automation rate crossing a publicly legible threshold (an RLI-style rate above 25%), or a named enterprise publishing audited agent ROI. From the Political Economy of Displacement: a displacement statute enacted in a G7 economy, a 10,000-worker action with AI as the stated issue, or the attribution dam breaking in official filings.

The interaction term: These two don’t simply add: Current 8’s response function is triggered by Current 6’s success and then feeds back into Current 6’s arithmetic. An agent-hour levy or mandatory displacement reporting changes the cost side of the hours-and-dollars math directly; a professional-licensure carve-out removes whole task categories from the addressable pool overnight. The trap mirrors composite D, but points the other way: the organizations that adopt fastest while attribution is voluntary accumulate the largest retroactive exposure for the day it becomes mandatory. The winners of this composite are the ones whose agent usage is instrumented, documented, and defensible when the rules arrive mid-game, and they are rarely the fastest adopters.

First move: Instrument agent usage now (hours, tokens, workflows touched, headcount deltas) and write the augment-versus-replace rationale down as you make each decision. If a reporting regime lands in 2027, compliance should be a query, not an archaeology project.
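“Compliance should be a query” has a concrete meaning: the units a future regime might tax or require you to report are already columns in something you own. A minimal sketch of such a record follows. The field names and the `report` function are my assumptions, since no regime has defined its unit yet; the jurisdiction column reflects the advice to treat where agents run as a variable, like data residency.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentUsage:
    day: date
    workflow: str
    jurisdiction: str       # where the agent ran
    agent_hours: float
    tokens: int
    headcount_delta: int    # change in human headcount tied to this workflow
    decision: str           # "augment" or "replace"
    rationale: str          # written down when the decision was made


def report(records: list[AgentUsage], year: int, jurisdiction: str) -> dict:
    """Answer a hypothetical reporting request with a filter and four sums."""
    rows = [r for r in records if r.day.year == year and r.jurisdiction == jurisdiction]
    return {
        "agent_hours": sum(r.agent_hours for r in rows),
        "tokens": sum(r.tokens for r in rows),
        "headcount_delta": sum(r.headcount_delta for r in rows),
        "replace_decisions": sum(1 for r in rows if r.decision == "replace"),
    }


# Illustrative log; every value is made up.
log = [
    AgentUsage(date(2027, 3, 1), "invoice-triage", "DE", 120.0, 4_000_000, -1,
               "replace", "role not backfilled after departure"),
    AgentUsage(date(2027, 4, 1), "code-review", "DE", 80.0, 1_000_000, 0,
               "augment", "reviewers keep sign-off"),
    AgentUsage(date(2027, 4, 1), "code-review", "US", 50.0, 500_000, 0,
               "augment", "reviewers keep sign-off"),
]
print(report(log, 2027, "DE"))
```

The `rationale` field matters as much as the numbers: it is the contemporaneous paper trail, which is far cheaper to write at decision time than to reconstruct later.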

Running composites in practice. In your quarterly review, don’t stop at “did a trigger fire?” Ask the second question: did a pair fire within the same quarter, from independent events? The July 2026 logs are a live exercise in why that last clause matters. They contain a cluster of firings: Current 1 (Fable 5, June 9), Current 3 (the S-1, June 1), Current 4 (the recall, June 12), Current 6 (the RLI result, July 1). Run the one-event test before calling pairs: the launch and the recall are one event chain touching Currents 1, 2, and 4, so they carry one count, not three. But the S-1 was filed before the launch, and the RLI measurement is an independent instrument; those counts stand. That leaves two live pair-candidates: composite A (the capability step previewed while the IPO trigger armed) remains the one to track into 2027, and composite E got a live-fire rehearsal on June 12, when the severed-vendor drill stopped being hypothetical; the companion booklet is the book-length treatment of that day.

These seven composites are not a complete list; they are the seven most load-bearing for a typical enterprise. The two currents most specific to your organization may form a composite that matters more. Write that one yourself, in this same format: the trigger pair, the interaction term, the first move.

The stance. No current is “the answer.” The strongest planning position is the one that performs adequately under all eight, not the one that bets everything on whichever current seems most live this quarter. Lock to none; watch triggers in all; treat a single firing as a reason to reweight and a pair firing as the moment strategy actually changes; let currents go stale when their triggers haven’t fired in 18 months and replace them with what better describes the world you’re actually living in. That discipline is what this booklet exists to make routine.

Practice

How to Use This in Practice

If you take one thing from this booklet, take this: currents are useful only if you commit to a habit. The habit is foresee, watch triggers, adjust. The currents are scaffolding for that habit, not a forecast.

Pre-commit to triggers, not predictions

The most useful artifact in this booklet is the trigger list under each current. Far more than the prose or the synthesis, the triggers are what tell you something has shifted. Decide now, before the headlines, what would update you. “If frontier agent cost-per-hour drops below $40, I will pilot the displacement workflow.” “If GPU rental prices fall another 30%, I will renegotiate our vendor contract.” The point is to short-circuit the response time between observing a signal and acting on it.
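Pre-commitment is easiest to honor when the trigger is written as a condition plus an action, before the data arrives. Here is a sketch using the two example triggers above. The metric names (`agent_cost_per_hour_usd`, `gpu_rental_price_change_pct`) are hypothetical, and where the numbers come from is left to you.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Trigger:
    name: str
    fires: Callable[[dict[str, float]], bool]   # condition, decided in advance
    action: str                                 # response, decided in advance


TRIGGERS = [
    Trigger(
        "frontier agent cost-per-hour below $40",
        lambda m: m["agent_cost_per_hour_usd"] < 40,
        "pilot the displacement workflow",
    ),
    Trigger(
        "GPU rental prices fall another 30%",
        lambda m: m["gpu_rental_price_change_pct"] <= -30,
        "renegotiate our vendor contract",
    ),
]


def actions_due(metrics: dict[str, float]) -> list[str]:
    """Return the pre-committed actions whose conditions hold right now."""
    return [t.action for t in TRIGGERS if t.fires(metrics)]


# Hypothetical readings at a quarterly review.
print(actions_due({"agent_cost_per_hour_usd": 38.0, "gpu_rental_price_change_pct": -12.0}))
```

The design choice that matters is that `action` is fixed at the same moment as the condition. Once a trigger fires, the review debates how to execute, not whether the signal counts.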

Review on a cadence

Quarterly is probably right for most teams. Faster than that and you’re reading noise; slower and you miss real shifts. Each review, walk through the trigger list and ask: has any trigger fired? Has any disconfirming signal landed? What changed in our environment? Update your stance accordingly. The output of the review is rarely “we were wrong”; more often it’s “we should weight this current heavier than we did last quarter.” The green trigger-log panels in this edition are that review done in public: between the May and June editions alone, a capability trigger fired early in variant form (Current 1), an IPO trigger went live (Current 3), a court date slipped (Current 5), and a doubling cadence was confirmed (Current 6). Between June and July, the cadence accelerated: a kill switch was proven and metered back on (Current 4), displacement got a leaderboard (Current 6), the politics of power arrived (Current 7), and the vendors started pre-negotiating the backlash (Current 8). That is the cadence working; copy the format for your own currents.

Let currents go stale

If a current’s triggers haven’t fired in 18 months and its disconfirming evidence has been steadily accumulating, the current probably isn’t live anymore. Retire it. Replace it with one that better describes the world you’re actually living in. This booklet is a snapshot of mid-2026; by mid-2027 at least one of these currents will likely need replacing. That’s the system working, not failing.
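The retirement rule has two conditions, and both must hold. A small sketch of the check, assuming whole-month granularity and a yes/no judgment on the disconfirming evidence; both simplifications are mine.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months between two dates, ignoring the day of month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def is_stale(
    last_fired: date,
    review_day: date,
    disconfirming_accumulating: bool,
    limit_months: int = 18,
) -> bool:
    """Retire only if the current is both quiet and steadily disconfirmed."""
    quiet = months_between(last_fired, review_day) >= limit_months
    return quiet and disconfirming_accumulating


# Hypothetical current whose last trigger fired in January 2025.
print(is_stale(date(2025, 1, 15), date(2026, 7, 1), True))
```

A quiet current with no disconfirming evidence stays on the list: silence alone does not retire it.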

Currents changelog: how this booklet has evolved

The retire-and-replace rule isn’t hypothetical; it has already run. For the record, and so you can score the method itself:

May 2026 edition: “Plateau + Regulation” was rebuilt into From Lab to Production (the plateau thesis wasn’t earning its keep, and regulation was demoted from headline to secondary force raising the deployment floor). “Agentic Acceleration” was rebuilt into Hours and Dollars: from a vibe about agents to two measurable units.

July 2026 edition: two currents added: The Physical Substrate (promoted from a background assumption after capacity auctions hit caps and moratorium bills reached a dozen statehouses) and The Political Economy of Displacement (split out of Hours and Dollars once the RLI made displacement legible enough to have politics). Plus a page this booklet used to refuse to write: the author’s own scoreable position, next.

What I think, and where I’d put my chips

I want to be precise about my view: I think this approach (currents plus triggers plus periodic adjustment) is the healthy way to navigate a technology that changes this fast. I do not think your strategy should bet on any one of the eight currents in this booklet; the discipline of holding several open simultaneously, watching what fires, and updating without ego is the actual skill worth building. But earlier editions used that stance to avoid stating any view at all, and I’ve come to see that as its own kind of dodge. My students ask me, fairly, what I actually expect to happen, and a booklet that preaches falsifiable triggers while its author declines to be falsifiable isn’t practicing what it teaches. So the method stays agnostic, and my opinion gets its own page: dated, with probabilities, and scored in public every edition. That page is next.

The Author’s Position

Where I’d Put My Chips
“Dated July 2026: score me next edition”

Everything on this page is opinion. It is the answer to the question the rest of the booklet deliberately refuses: what do I actually think will happen? The method doesn’t depend on any of it; if every bet below is wrong, the currents and triggers still work. But each bet carries a probability and a condition under which I’m wrong, and every future edition re-scores them in place, misses included. The same discipline the trigger logs impose on the currents, applied to me.

Bet 1 · by mid-2027 · ~85%

The step confirms. The full Mythos / Spud generation ships in generally available (non-gated) form and delivers the capability tier that Fable 5 previewed, on agent-economics metrics rather than just benchmark points: something like another doubling of RLI-class automation rates over the Opus 4.8 / GPT-5.5 cohort. From where I sit, this generation is ready, and it delivers on expectations.

Wrong if no full non-gated deployment of the generation exists by June 30, 2027, or the deployed generation lands within noise of Opus 4.8 on independent benchmarks, in which case the “preview” reading of Fable 5 was wrong and the staircase already flattened.

Bet 2 · by end of 2029 · ~70%

The 2029 generation lands heavy, and it is already financed. By the decoder’s own logic, the money being committed in 2026–27 is buying the generation after Mythos / Spud, arriving around 2029. I avoid the term AGI; I mean something more concrete: an LLM generation so capable that it materially impacts how research is done, how industries operate, and how economies allocate work. Operationalized: an RLI-style automation rate above 50%, or a frontier lab credibly documenting a material LLM contribution to a first-tier scientific result, or displacement visible in national labor statistics. The signals I weight: the research talent consolidating at the leading labs, what figures like Hassabis are saying out loud, and the fact that models are already contributing meaningfully to research in mid-2026.

Wrong if the end of 2029 arrives with automation rates plateaued below 30% and no documented research-grade contribution, in which case the staircase bent exactly where the skeptics said it would, and the 2026–27 capex bought refinement, not transformation.

Bet 3 · by end of 2027 · ~80%

More severance, not less. At least two further government-imposed severance events (designations, recalls, access-gates, export restrictions on models themselves) hit frontier AI. June 12 was a precedent, not a one-off. I expect the US bloc to play the economics of frontier AI the way the companion booklet describes: export the goods, keep the factory. This is the pessimistic half of my position, and I hold it about as firmly as the optimistic half above.

Wrong if June 12 stands alone through the end of 2027, with no further designation, recall, or access-gate on a frontier model, in which case the mercantilist reading over-indexed on one event and sovereignty deserves a lighter weight than I give it below.

Bet 4 · by end of 2028 · ~60%

The China tell fires. The first Chinese lab to hold a credible frontier-parity claim ships that flagship closed. People placing their hopes in Chinese open-weight models tracking the frontier indefinitely are, I think, mis-reading why those models are open. Openness is a position, not a principle: Meta held “open” as an identity until the economics turned, Mistral reframed open weights as a funnel, and a lab that believes it has taken the lead has every incentive to close. (This is the companion booklet’s Bet 5, and I’m adopting it here as my own.)

Wrong if the end of 2028 arrives and every Chinese flagship with a credible parity claim is still open-weight, in which case open-weight release survives winning positions better than the positional theory predicts, and the sovereignty picture is genuinely brighter than I believe.

The weights

If the four bets are point predictions, this table is the portfolio: where I’d allocate 100 units of monitoring attention across the eight currents this quarter. Equal weighting would be 12.5 each; the deviations are the opinion.

Current	Weight	vs. equal	Why
1 · Continued Scaling	20	Over	The step is real, previewed, and the next one is already financed. This is the current I’d least want to be surprised by.
2 · Efficiency Revolution	5	Under	Real, but running one tier below the frontier, and the frontier premium (Fable 5 at 2× Opus pricing) says the top lane holds. For the workloads that decide the next three years, this current follows rather than leads.
3 · Financial Correction	7	Under	The revenue curve bent the strong bear case, and capability delivering (Bet 1) keeps bending it. The IPO trigger stays armed, though; this weight rises the day a listing prices.
4 · Sovereignty	20	Over	June 12 proved the kill switch. I expect this to play out badly, toward mercantilism rather than détente. My most confident pessimism.
5 · From Lab to Production	16	Over	With capability delivering, the action moves downstream to deployment. This is also where my consulting clients actually live, quarter after quarter; the gap between what the model can do and what the organization ships is the story of 2026.
6 · Hours and Dollars	20	Over	The RLI made displacement measurable, and I feel the hours jump at my own desk (see the field note in Current 6). The doubling cadence at the fast end of assumptions is the most under-priced fact in the booklet.
7 · The Physical Substrate	7	Under	Slow until it isn’t, and adaptation has so far outrun the doom case. Taiwan stays the fattest tail risk on this list, but tail risks earn a tabletop exercise, not a fifth of my attention.
8 · Political Economy of Displacement	5	Under	Lags Current 6 by design. Watch and instrument rather than act; this weight doubles the first time an attribution dam breaks.

100 units of attention, July 2026 edition. The weights get re-argued every edition; the previous edition’s weights stay visible so drift is scoreable.
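The arithmetic behind the table can be checked in a few lines. What follows is a minimal sketch, not part of my method: the weights are copied from the table above, while the names `WEIGHTS`, `EQUAL`, and `versus_equal` are illustrative choices of mine.

```python
# Weights copied from the table above (July 2026 edition).
WEIGHTS = {
    "1 Continued Scaling": 20,
    "2 Efficiency Revolution": 5,
    "3 Financial Correction": 7,
    "4 Sovereignty": 20,
    "5 From Lab to Production": 16,
    "6 Hours and Dollars": 20,
    "7 The Physical Substrate": 7,
    "8 Political Economy of Displacement": 5,
}

# Equal weighting: 100 units spread over eight currents.
EQUAL = 100 / len(WEIGHTS)


def versus_equal(weight: float, equal: float = EQUAL) -> str:
    """Label a weight the way the 'vs. equal' column does."""
    if weight > equal:
        return "Over"
    if weight < equal:
        return "Under"
    return "Equal"


total = sum(WEIGHTS.values())

# Signed deviation from equal weighting; this is "the opinion".
deviations = {name: w - EQUAL for name, w in WEIGHTS.items()}

# Share of attention claimed by the over-weighted currents.
over_share = sum(w for w in WEIGHTS.values() if versus_equal(w) == "Over")

for name, w in WEIGHTS.items():
    print(f"{name:<38} {w:>3}  {versus_equal(w):<5} {deviations[name]:+.1f}")
print(f"total={total}, over-weighted share={over_share}")
```

Run as written, the check confirms that the weights sum to 100 and that the four over-weighted currents (1, 4, 5, and 6) claim 76 of the 100 units between them.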

The scoring promise. Next edition, every bet above gets a status tag in place (fired, not yet, or wrong) using the same vocabulary as the trigger logs, and the weight table gets a “last edition” column. If I’m wrong, the wrongness stays on the page. That’s the price of asking you to take the probabilities seriously.
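One way to make the status vocabulary mechanical is sketched below, under stated assumptions. The three tags come from the promise above; the function `bet_status` is hypothetical, and reading “the end of 2028” in Bet 4 as December 31, 2028 is my simplification. A bet can have falsifiers other than an expired deadline, so treat this as an illustration rather than the scoring rule.

```python
from datetime import date

# Status vocabulary shared with the trigger logs.
FIRED, NOT_YET, WRONG = "fired", "not yet", "wrong"


def bet_status(fired: bool, deadline: date, today: date) -> str:
    """Tag a dated bet: fired if the signal occurred, wrong if the
    deadline passed without it, otherwise not yet."""
    if fired:
        return FIRED
    if today > deadline:
        return WRONG
    return NOT_YET


# Assumption: "the end of 2028" is read as the last day of that year.
BET4_DEADLINE = date(2028, 12, 31)

# Illustrative review dates, not observations.
print(bet_status(False, BET4_DEADLINE, date(2026, 7, 1)))   # still open
print(bet_status(False, BET4_DEADLINE, date(2029, 1, 1)))   # deadline passed
print(bet_status(True, BET4_DEADLINE, date(2027, 3, 1)))    # signal occurred
```

The point of the design is that the tag depends only on two recorded facts, whether the signal fired and what date it is, so the scoring cannot be softened after the fact.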

Interactive versions of all visualizations: demos.barcik.training
Full research and data: publications.barcik.training

Scenario Planning for Generative AI

The Question

The Capex Decoder “Reading the future from money already spent”

The staircase pattern

How to use it from here

Continued Scaling “Does the staircase keep climbing?”

The bet, stated plainly

What is being built

The inference bet

Trigger signals: what to watch for

Implications by role

Efficiency Revolution “How Much Does GPT-4 Cost?”

The training cost freefall

The open-source convergence

Inference pricing in freefall

The April 2026 wave

Where does value go when the model is free?

Trigger signals: what to watch for

Implications by role

Financial Correction “Have We Seen This Before?”

Survivors vs. Casualties, then and now

The dot-com precedent

The bubble argument has matured

Amazon vs. Pets.com

AI investment has entered unprecedented territory

The revenue gap

Personal value is clear. Enterprise value is the open question.

Vendor concentration is the under-discussed risk

Why the parallel breaks, and why it might not matter

Trigger signals: what to watch for

Implications by role

Sovereignty “What if your vendor isn’t allowed to sell to you?”

Two collisions, one pattern

June 12: the collision generalized

The Chinese open-weight wave fills the gap

What this means for EU enterprises

Where this current continues

Trigger signals: what to watch for

Implications by role

From Lab to Production “What we learned from 2015, and what’s different now”

The 2015 parallel

What’s different this time

The benchmark-to-deployment gap, quantified

Regulation as a secondary force raising the floor

What teams that bridge the gap actually look like

Trigger signals: what to watch for

Implications by role

Hours and Dollars “The two units that will decide displacement”

Two units

Where the autonomy is today

The cost comparison

The METR data point

The Remote Labor Index: displacement gets a leaderboard

What the Fable 5 pricing does to the math

What this implies for capex

Trigger signals: what to watch for

Implications by role

The Physical Substrate “Can the atoms keep up with the bits?”

The bit-atom mismatch

Power: the queue is the moat

Chips: one island, one packager, three memory vendors

The politics of power arrived

Why this is a current, not a footnote to Current 1

Trigger signals: what to watch for

Implications by role

The Political Economy of Displacement “What happens when Current 6’s math starts working?”

The canaries stopped singing

Attribution laundering

The tax question arrives before the tax

Labor’s playbook already exists

Why this is a current, not a chapter of Current 6

Trigger signals: what to watch for

Implications by role

How the Currents Interact

How to Use This in Practice

Pre-commit to triggers, not predictions

Review on a cadence

Let currents go stale

What I think, and where I’d put my chips

Where I’d Put My Chips “Dated July 2026: score me next edition”
