SemiAnalysis Companion — Volume III — Visual Atlas

Q2 2026 — Ten Plates

A field guide

The AI Compute Industry,
drawn to scale.

Ten diagrams that make the technical map and the beginner's companion legible at a glance. Each plate is a concept graph — nodes are things, edges are dependencies. Read them slowly; the industry is a graph, not a list.

Read alongside / SemiAnalysis_Knowledge_Map.md · SemiAnalysis_Beginner_Companion.md

Figure 01 / The Meta-Frame

The Three Bottlenecks.

Every AI-compute story is one of three constraints in disguise. They are not independent — relieving one usually loads another.

Figure 01. Three-bottleneck frame (after Dylan Patel). Arrows show that pressure on any pillar reshapes the adjacent ones — co-design makes the system coupled.

Logic tells

"TSMC raises N3 wafer prices." "CoWoS capacity sold out through 2027." "Intel 18A yield update."

Memory tells

"SK Hynix HBM4 sold out." "Samsung still can't qualify HBM3E with Nvidia." "Hybrid-bonding tool shortage."

Power tells

"Microsoft signs Three Mile Island deal." "OpenAI orders 2.3 GW of gas turbines." "xAI trucks in generators."

Figure 02 / The Map

The Sixteen-Layer Stack.

The spine of the industry, read upward from atoms to geopolitics. Everything in the technical map lives at one of these layers — and every layer constrains the one above it.

Figure 02. The stack as layered cake. Sections are the "gears" of the industry: physical substrate, system assembly, infrastructure, and the economic/political envelope that contains everything.

Modern AI hardware is co-designed across multiple layers at once. You cannot understand the chip without knowing what rack it goes in — and vice versa.

Figure 03 / Layer 1 → 2

Anatomy of a Chip.

A cross-section of a modern leading-edge die. Signals come in from the top; power increasingly comes in from the back. Nine-tenths of the height is wire, not transistor.

Figure 03. A leading-edge die, cross-section. Top: micro-bumps connecting up to the package. Middle: the BEOL wire stack and the transistors underneath. Bottom: the new backside power network.

Figure 04 / Layer 7

Advanced Package Cross-Section.

A modern AI accelerator is not a chip but a package: compute dies, HBM stacks, and a silicon interposer integrated into one object. This is CoWoS, TSMC's engine of the AI buildout.

Figure 04. Cross-section of a Blackwell-/Rubin-class CoWoS-L package: two reticle-limited compute dies stitched through an NV-HBI bridge, flanked by 8–16 HBM stacks on a silicon/RDL interposer. The package is the accelerator.

The chokepoint

Each Rubin package consumes ~2.5× the CoWoS interposer area of a Hopper package. TSMC CoWoS capacity: 15k → 40k → 80k wafers per month (2022 → 2024 → 2026).
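As a rough sanity check on the figures above, the sketch below compares raw wafer growth with growth measured in Hopper-sized packages. It assumes, purely for illustration, that every 2022 wafer carried Hopper-class packages, that every 2026 wafer carries Rubin-class ones, and that interposer area translates directly into packages per wafer. None of these assumptions comes from the source.

```python
# CoWoS capacity quoted above, in wafers per month.
capacity_wpm = {2022: 15_000, 2024: 40_000, 2026: 80_000}

# Interposer area per package relative to Hopper (Hopper = 1.0).
# The 2.5x figure is from the text; using it as a direct divisor on
# packages per wafer is a simplifying assumption.
RUBIN_AREA_VS_HOPPER = 2.5


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


# Growth in raw wafer starts, 2022 to 2026.
wafer_growth = capacity_wpm[2026] / capacity_wpm[2022]

# Growth in package output if each package now needs 2.5x the area.
package_growth = wafer_growth / RUBIN_AREA_VS_HOPPER

annual_rate = cagr(capacity_wpm[2022], capacity_wpm[2026], years=4)

print(f"wafer capacity growth:   {wafer_growth:.2f}x")
print(f"package output growth:   {package_growth:.2f}x")
print(f"wafer capacity CAGR:     {annual_rate:.0%}")
```

Under these assumptions, a roughly 5.3× increase in wafers yields only about 2.1× as many packages, which is one way to read why the interposer remains the chokepoint.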

Hybrid bonding

At HBM4E, Cu-to-Cu direct bonds replace micro-bumps. <10 µm pitch enables thinner stacks and better thermals — gated by BESI/AMAT/Shibaura tool supply.

The HBM logic die

HBM4's new 2048-bit bus requires a logic-node (N5/N3) base die underneath the DRAM stack. HBM is now a logic-process product.
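To see what the wider bus buys, per-stack bandwidth can be estimated as bus width times per-pin data rate. In the sketch below the 2048-bit width is from the text and the 1024-bit comparison is the interface width of earlier HBM generations. The 8 Gb/s pin rate is an illustrative assumption, held equal for both cases so that only the bus width differs.

```python
def stack_bandwidth_gb_s(bus_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one HBM stack in GB/s.

    bus_bits      -- interface width in bits
    pin_rate_gbps -- data rate per pin in Gb/s
    """
    return bus_bits * pin_rate_gbps / 8  # divide by 8: bits to bytes


# Assumed per-pin rate, for illustration only (not from the source).
PIN_RATE_GBPS = 8.0

narrow = stack_bandwidth_gb_s(1024, PIN_RATE_GBPS)  # pre-HBM4 width
wide = stack_bandwidth_gb_s(2048, PIN_RATE_GBPS)    # HBM4 width

print(f"1024-bit stack: {narrow:.0f} GB/s")
print(f"2048-bit stack: {wide:.0f} GB/s")
print(f"ratio:          {wide / narrow:.1f}x")
```

At a fixed pin rate, doubling the bus width doubles the bandwidth per stack; the cost is twice as many signals to route through the base die.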

Figure 05 / Layer 1 — Physics

Four Generations of Transistor.

The atomic-scale story of how the industry kept shrinking past the limits of each previous design — by redesigning the switch itself.

Figure 05. Four transistor architectures, chronological. The gate (copper) gains control over the channel (green) by surrounding more of it: flat (planar), then three sides (FinFET), then all four (gate-all-around), then finally stacked.

Figure 06 / The Geography

The Global Supply Chain.

No country makes a modern AI chip alone. Each step is controlled by one to three suppliers, usually in a single country, and disrupting any of them cascades through the whole industry.

Figure 06. Six-country concentration map. Each box shows what that geography contributes and how dominant its suppliers are. The arrows are the physical and commercial dependencies that route through Taiwan.

Figure 07 / The Journey

From Sand to Datacenter.

A single GPU's journey through the industry, with the cumulative lead-time clock running in the margin. It takes the better part of a year to go from polished wafer to a running rack.

Figure 07. End-to-end lead-time for a single GPU. Every week is a week that the datacenter above it cannot be filled. This is why forecasting error at any step propagates through hundreds of billions of dollars of capex.

Figure 08 / Layer 10 — The Unit of Product

GB200 NVL72 Rack Architecture.

Post-Blackwell, the rack is the product: seventy-two GPUs stitched into one scale-up domain by a copper NVLink backplane, drawing ~120 kW and liquid-cooled. Rubin's Kyber rack will push this to 144 and then 576 GPUs, and go optical.

Figure 08. Schematic front view of a GB200 NVL72. Compute trays in navy, nine NVSwitch trays in the center forming the copper spine. Power on top, liquid cooling on bottom. Seventy-two GPUs talk as one.
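The rack-level numbers reduce to simple arithmetic. The sketch below takes the 72 GPUs and ~120 kW from the text and derives each GPU's share of rack power, then estimates how many such racks a given IT power budget could feed. The `racks_supported` helper and its `it_power_mw` parameter are hypothetical names introduced here, and the estimate ignores cooling and conversion overhead.

```python
GPUS_PER_RACK = 72   # from the text
RACK_KW = 120.0      # approximate rack draw, from the text

# Each GPU's share of rack power. This includes CPUs, switch trays,
# and other rack components, so it is not the GPU's own power draw.
per_gpu_share_kw = RACK_KW / GPUS_PER_RACK


def racks_supported(it_power_mw: float) -> int:
    """Whole racks that fit in an IT power budget (hypothetical helper)."""
    return int(it_power_mw * 1000 // RACK_KW)


racks = racks_supported(1000.0)  # 1 GW of IT power, for illustration
gpus = racks * GPUS_PER_RACK

print(f"per-GPU share of rack power: {per_gpu_share_kw:.2f} kW")
print(f"racks in 1 GW of IT power:   {racks}")
print(f"GPUs in those racks:         {gpus}")
```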

Figure 09 / Layer 9

Three Networks in an AI Cluster.

An AI cluster is not one network but three, at very different scales, speeds, and costs. The back-end alone accounts for ~85% of cluster networking spend.

Figure 09. The three networks of a modern AI cluster, ordered top-to-bottom by bandwidth. Scale-up keeps GPUs close; scale-out stitches racks into a pod; the front-end is just cloud networking.

Figure 10 / Layers 11 + 12

Datacenter Power & Cooling Stack.

From high-voltage utility feeder to transistor: roughly a dozen conversion stages, each shedding heat, each with its own supply chain. This is why "build a 1 GW datacenter" is a three-year project, not a three-month one.

Figure 10. Energy flow (top row, left-to-right) and heat flow (middle row, right-to-left). Roughly 15–20% of the incoming energy is lost to conversion. Every kilowatt delivered to silicon must be pulled back out as heat by the parallel liquid loop.
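The loss figure compounds across stages, which a short calculation makes concrete. The sketch below assumes exactly twelve conversion stages ("roughly a dozen") with identical efficiency, which real power chains will not have. It back-solves the per-stage efficiency implied by a 15–20% end-to-end loss, and the feeder power needed to deliver a given load to silicon.

```python
STAGES = 12  # "roughly a dozen"; the exact count is an assumption


def per_stage_efficiency(total_loss: float, stages: int = STAGES) -> float:
    """Uniform per-stage efficiency implied by an end-to-end loss fraction."""
    return (1.0 - total_loss) ** (1.0 / stages)


def feeder_power_mw(delivered_mw: float, total_loss: float) -> float:
    """Utility-side power needed to deliver `delivered_mw` to silicon."""
    return delivered_mw / (1.0 - total_loss)


for loss in (0.15, 0.20):
    eff = per_stage_efficiency(loss)
    feed = feeder_power_mw(1000.0, loss)  # 1 GW at the silicon
    print(f"loss {loss:.0%}: per-stage efficiency {eff:.2%}, "
          f"feeder power {feed:.0f} MW")
```

Even when each stage is better than 98% efficient, a dozen in series cost 15–20% end to end, so delivering 1 GW to silicon takes roughly 1.18–1.25 GW at the feeder.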

US power demand was flat for 15 years. AI broke that. Utility interconnection queues now stretch 3–7 years — which is why every hyperscaler is now also, reluctantly, an energy company.