SemiAnalysis Companion  —  Volume III  —  Visual Atlas
Q2 2026  —  Ten Plates
A field guide

The AI Compute Industry,
drawn to scale.

Ten diagrams that make the technical map and the beginner's companion legible at a glance. Each plate is a concept graph — nodes are things, edges are dependencies. Read them slowly; the industry is a graph, not a list.

Read alongside  /  SemiAnalysis_Knowledge_Map.md  ·  SemiAnalysis_Beginner_Companion.md

Figure 01  /  The Meta-Frame

The Three Bottlenecks.

Every AI-compute story is one of three constraints in disguise. They are not independent — relieving one usually loads another.

THREE BINDING CONSTRAINTS  ·  Q2 2026

01  Logic  ·  THE BRAIN CHIPS
Leading-edge wafer supply. TSMC N3 · N2 · A16. Intel 18A · 14A (uncertain).
Key number: ~90% TSMC share of leading-edge foundry wafers.
Pushes on → more HBM per die, to extract more work per scarce wafer.

02  Memory  ·  THE SCRATCH PAD
HBM capacity & bandwidth. SK Hynix · Samsung · Micron. Gated by hybrid-bonding tools.
Key number: 3 suppliers control ~95% of DRAM bits worldwide.
Pushes on → lower precision (FP8, FP4) and denser packaging (CoWoS-L).

03  Power  ·  THE ELECTRICITY
Grid interconnect, onsite gas. Utility + SMR + behind-meter. Gigawatt-class buildings.
Key number: 3–7 yr waiting time for a new US grid interconnection.
Pushes on → bigger single buildings and more compute per watt.

Loops between the pillars: α co-design  ·  β thermal density  ·  γ (outer loop) FLOPS ⇌ watts

THE LAW OF CONSERVATION OF BOTTLENECKS
Fixing any one pushes demand onto the other two. You never solve the industry; you only shift which constraint is binding this quarter. Read every news item by asking: which one is this really about?
Figure 01. Three-bottleneck frame (after Dylan Patel). Arrows show that pressure on any pillar reshapes the adjacent ones — co-design makes the system coupled.

Logic tells

"TSMC raises N3 wafer prices." "CoWoS capacity sold out through 2027." "Intel 18A yield update."

Memory tells

"SK Hynix HBM4 sold out." "Samsung still can't qualify HBM3E with Nvidia." "Hybrid-bonding tool shortage."

Power tells

"Microsoft signs Three Mile Island deal." "OpenAI orders 2.3 GW of gas turbines." "xAI trucks in generators."

Figure 02  /  The Map

The Sixteen-Layer Stack.

The spine of the industry, read upward from atoms to geopolitics. Everything in the technical map lives at one of these layers — and every layer constrains the one above it.

READ BOTTOM-UP  ·  EACH LAYER CONSTRAINS THE ONE ABOVE
Bands: PHYSICAL  ·  SYSTEMS  ·  INFRASTRUCTURE  ·  ECONOMIC & POLITICAL
Format: layer number, name: contents  [chokepoint intensity: reason]

16  Geopolitics & Regulation: Export controls · CHIPS Act · Taiwan risk  [CRITICAL: US-China flashpoint]
15  Business & Capital Markets: Hyperscaler capex · Nvidia P&L · neoclouds  [HIGH: ~6 hyperscalers]
14  AI Models & Workloads: Scaling laws · RL · training vs. inference  [HIGH: 10 frontier labs]
13  Software Stack: CUDA · Triton · PyTorch · TensorRT-LLM  [CRITICAL: CUDA lock-in]
12  Power & Energy: Grid · onsite gas · SMR · utility queue  [HIGH: grid interconnect]
11  Datacenter Infrastructure: Electrical · liquid cooling · site selection  [HIGH: few GW builders]
10  Systems, Servers & Rack Architecture: NVL72 · Kyber · ODMs (Supermicro, Foxconn)  [HIGH: ~5 ODMs]
09  Networking & Interconnect: NVLink · InfiniBand · Ethernet · CPO · OCS  [CRITICAL: NVLink / InfiniBand]
08  Accelerators & Processors: Nvidia · AMD · TPU · Trainium · MTIA · Ascend  [CRITICAL: Nvidia ~90%]
07  Advanced Packaging: CoWoS-S/R/L · SoIC · EMIB · Foveros  [CRITICAL: TSMC CoWoS sole]
06  Memory (DRAM, HBM, NAND): HBM4 · hybrid bonding · LPDDR · MRDIMM  [CRITICAL: 3 HBM suppliers]
05  Foundries & Fab Business Models: TSMC · Samsung · Intel · SMIC  [CRITICAL: TSMC ~90%]
04  Wafer Fab Equipment (WFE): ASML · AMAT · LAM · TEL · KLA  [HIGH: Big 5 (ASML lead)]
03  Lithography & Patterning: EUV Low-NA · High-NA · DSA · Sculpta  [CRITICAL: ASML EUV sole]
02  Process Technology & Nodes: N3 · N2 · A16 · Intel 18A · 14A  [CRITICAL: TSMC / Samsung]
01  Semiconductor Physics & Materials: Transistors · BEOL · BSPDN · 2D channels  [LOW: academic / diffuse]

FOUNDATION
A single transistor, billions of times per chip. Everything above is an amplifier.
Figure 02. The stack as layered cake. Sections are the "gears" of the industry: physical substrate, system assembly, infrastructure, and the economic/political envelope that contains everything.
Modern AI hardware is co-designed across multiple layers at once. You cannot understand the chip without knowing what rack it goes in — and vice versa.

Figure 03  /  Layer 1 → 2

Anatomy of a Chip.

A cross-section of a modern leading-edge die. Signals come in from the top; power increasingly comes in from the back. Nine-tenths of the height is wire, not transistor.

CROSS-SECTION  ·  NOT TO SCALE  ·  SCHEMATIC
Layers as labelled in the drawing: MICRO-BUMPS / Cu PILLARS (→ package)  ·  BEOL  ·  FEOL  ·  SILICON SUBSTRATE (bulk Si)  ·  BACKSIDE POWER METAL (PowerVia / Super Power Rail)

BEOL (BACK-END-OF-LINE)
15+ layers of metal wires. Cu at thick layers, Co / Ru / Mo at the tightest. Most of the chip's volume, and most of its signal delay.

FEOL (TRANSISTORS)
Billions of GAA nanosheets. Planar → FinFET → Gate-All-Around. Stacks of horizontal channels with the gate wrapping all four sides.

BSPDN (BACKSIDE POWER)
The recent big win. Grinds the back of the wafer, puts power rails underneath. ~6–8% perf at iso-power, ~20% less IR drop. Intel shipped first.

SCALE REFERENCE
Gate pitch on N2: ~45 nm. A human hair is ~70,000 nm wide. ~1,500 transistors across one hair.

COVERED IN KNOWLEDGE MAP §1–§2
Figure 03. A leading-edge die, cross-section. Top: micro-bumps connecting up to the package. Middle: the BEOL wire stack and the transistors underneath. Bottom: the new backside power network.
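The scale reference in the plate is a one-line division, and it is worth checking. A minimal sketch, using only the two approximate figures quoted in the diagram (the function name `gates_across` is a hypothetical helper, not from the source):

```python
# Approximate values quoted in the figure's scale reference.
GATE_PITCH_NM = 45        # gate pitch on N2, roughly
HAIR_WIDTH_NM = 70_000    # width of a human hair, roughly


def gates_across(width_nm: int, pitch_nm: int) -> int:
    """Whole gate pitches that fit side by side across a given width."""
    return width_nm // pitch_nm


transistors_per_hair = gates_across(HAIR_WIDTH_NM, GATE_PITCH_NM)
print(transistors_per_hair)  # 1555, which the plate rounds to ~1,500
```

Both inputs are order-of-magnitude estimates, so only the first digit or two of the result carries meaning.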

Figure 04  /  Layer 7

Advanced Package Cross-Section.

A modern AI accelerator is not a chip — it is a package: compute dies, HBM stacks, and a silicon interposer all co-fabricated as one object. This is CoWoS, TSMC's engine of the AI buildout.

CoWoS-L PACKAGE  ·  SCHEMATIC CROSS-SECTION

Left to right, on top of the interposer:
HBM4 · 12-Hi stack on a LOGIC BASE DIE (12 DRAM die + base, >2 TB/s per stack)
Compute Die  ·  RETICLE-LIMITED  ·  TSMC N3P / N2 / A16  ·  ~800 mm²
NV-HBI die-to-die bridge
Compute Die  ·  RETICLE-LIMITED  ·  ~800 mm²  (Blackwell: 2× · Rubin: 2×)
HBM4 · 12-Hi stack on a LOGIC BASE DIE

Underneath:
RDL + EMBEDDED LSI BRIDGES (CoWoS-L)  ·  TSV-threaded silicon
ORGANIC PACKAGE SUBSTRATE

Legend: Logic die (N3P/N2/A16)  ·  HBM DRAM layer  ·  Silicon interposer + LSI bridges  ·  Organic substrate  ·  NV-HBI die-to-die bridge
Figure 04. Cross-section of a Blackwell-/Rubin-class CoWoS-L package: two reticle-limited compute dies stitched through an NV-HBI bridge, flanked by 8–16 HBM stacks on a silicon/RDL interposer. The package is the accelerator.

The chokepoint

Each Rubin consumes ~2.5× the CoWoS interposer area of a Hopper. TSMC CoWoS capacity: 15k → 40k → 80k wpm (2022 → 2024 → 2026).

Hybrid bonding

At HBM4E, Cu-to-Cu direct bonds replace micro-bumps. <10 µm pitch enables thinner stacks and better thermals — gated by BESI/AMAT/Shibaura tool supply.

The HBM logic die

HBM4's new 2048-bit bus requires a logic-node (N5/N3) base die underneath the DRAM stack. HBM is now a logic-process product.
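The chokepoint numbers above imply a growth rate, and also a catch: wafer capacity grows while each package eats more interposer. A rough sketch of both, assuming (a simplification not stated in the source) that packages per wafer scale inversely with interposer area; `cagr` and `hopper_equivalent_wpm` are illustrative names:

```python
# CoWoS capacity in wafers per month (wpm), as quoted above.
CAPACITY_WPM = {2022: 15_000, 2024: 40_000, 2026: 80_000}
RUBIN_AREA_VS_HOPPER = 2.5   # "~2.5x the CoWoS interposer area of a Hopper"


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


growth = cagr(CAPACITY_WPM[2022], CAPACITY_WPM[2026], 2026 - 2022)

# ASSUMPTION: package count per wafer is inversely proportional to interposer
# area, so 2026 capacity spent on Rubin yields as many packages as this many
# wafers per month would have yielded for Hopper.
hopper_equivalent_wpm = CAPACITY_WPM[2026] / RUBIN_AREA_VS_HOPPER

print(f"capacity CAGR 2022-2026: {growth:.0%}")
print(f"2026 capacity in Hopper-equivalent wpm: {hopper_equivalent_wpm:,.0f}")
```

Under that assumption, a roughly 5.3× rise in wafers per month shrinks to about 2.1× in package terms (32,000 versus 15,000), which is one way to read why the shortage persisted through the capacity ramp.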

Figure 05  /  Layer 1 — Physics

Four Generations of Transistor.

The atomic-scale story of how the industry kept shrinking past the limits of each previous design — by redesigning the switch itself.

TRANSISTOR ARCHITECTURES  ·  FIVE DECADES OF SCALING
Axis: channel leakier → better gate control

01 · PLANAR  ·  Flat CMOS  ·  1970s–2011
Channel on surface. Leaks at small sizes.
Gate sides: 1 / 4  ·  Node debut: ~45–22 nm

02 · FinFET  ·  Intel 22nm (2011)  ·  2011–2022
Gate wraps three sides. Better control, less leak.
Gate sides: 3 / 4  ·  Node debut: 22 nm → 3 nm

03 · GAA / NANOSHEET  ·  TSMC N2 (2025)  ·  2022→
Stacked horizontal sheets. Gate wraps all four sides.
Gate sides: 4 / 4  ·  Node debut: TSMC N2 · 14A

04 · CFET  ·  Research  ·  ~2030
Stack n on p vertically (nFET over pFET). Reclaims lateral area.
Gate sides: 4 / 4 · ×2  ·  Research: imec · Intel · TSMC

Each generation is a response to the same problem: shorter channels leak. Better gate geometry fights back.
Figure 05. Four transistor architectures, chronological. The gate (copper) gains control over the channel (green) by surrounding more of it — flat, then three sides, then four, then finally stacked.

Figure 06  /  The Geography

The Global Supply Chain.

No country makes a modern AI chip alone. Each step is controlled by one-to-three suppliers, usually in a single country, and disrupting any of them cascades through the whole industry.

CONCENTRATION MAP  ·  NODES ARE COUNTRIES, EDGES ARE DEPENDENCIES

United States  ·  DESIGN  ·  Fabless design, EDA, IP
Nvidia · AMD · Apple · Broadcom. Synopsys · Cadence · Siemens. AMAT · LAM · KLA (WFE).
~85% of chip design value

Netherlands  ·  LITHOGRAPHY  ·  EUV monopoly
ASML, sole EUV maker. Cymer (laser sources). Low-NA: $170M/tool · High-NA: $350M+.
100% of EUV production

Japan  ·  MATERIALS  ·  Chemicals, wafers, tools
JSR · TOK · Shin-Etsu (resist). Shin-Etsu, SUMCO (wafers). Tokyo Electron · Canon · Nikon.
~60% of 300mm wafers

South Korea  ·  MEMORY  ·  HBM leadership
SK Hynix, HBM3/3E/4 leader. Samsung Memory + Foundry. 3D NAND + DRAM volume.
>50% of HBM bits

Taiwan  ·  THE HEGEMON  ·  Leading-edge fab & packaging
TSMC (fab + CoWoS). UMC · PSMC · VIS. ASE · Foxconn · Quanta · Wistron.
~90% of leading-edge wafers

China  ·  LOCAL ECOSYSTEM  ·  Trailing edge + ambition
SMIC, HuaHong · CXMT (memory). SMEE · NAURA · AMEC (WFE). Huawei, Alibaba, ByteDance (design).
Locked out of EUV & HBM

Edge labels: FABLESS IP  ·  EUV SCANNERS  ·  RESIST · WAFERS  ·  HBM STACKS  ·  PACKAGED GPU  ·  WFE CO-DEV  ·  EXPORT CONTROL
Legend: primary dependency  ·  secondary / cross-tie

SINGLE POINTS OF FAILURE
ASML EUV: literally the only source. Strike or export-control action halts all <7nm production worldwide.
TSMC N2/A16 in Hsinchu: no backup fab for leading-edge. Even Arizona Fab 21 is a node behind.
SK Hynix HBM: Samsung yields still lag; Micron small. A quality escape at Hynix would stall every Nvidia/AMD roadmap.
Figure 06. Six-country concentration map. Each box shows what that geography contributes and how dominant its suppliers are. The arrows are the physical and commercial dependencies that route through Taiwan.

Figure 07  /  The Journey

From Sand to Datacenter.

A single GPU's journey through the industry, with the cumulative lead-time clock running in the margin. It takes the better part of a year to go from polished wafer to a running rack.

PIPELINE  ·  APPROXIMATE LEAD-TIMES, CUMULATIVE

01  SILICA SAND → SILICON WAFER  ·  ~2 weeks  ·  JAPAN · 60% share
Shin-Etsu / SUMCO grow boules. Purify quartz to 9-nines (99.9999999%) silicon. Grow a single-crystal cylinder (Czochralski). Slice into 300 mm wafers, polish to atomic smoothness.

02  WAFER FABRICATION (FEOL + BEOL)  ·  ~3 months  ·  TAIWAN (TSMC Fab 18, 20)
TSMC runs the wafer through ~1,000 steps: 80+ lithography steps (a dozen EUV), deposition, etch, implant, planarization. Output: a wafer with ~60–70 AI-accelerator dies, each ~800 mm².

03  HBM STACK ASSEMBLY  ·  ~2 months  ·  KOREA (Icheon, Cheongju)
SK Hynix stacks 12 DRAM dies + base die. TSVs drilled through each DRAM die; stack bonded with micro-bumps or hybrid bonding. Logic base die (TSMC N5/N3) drives HBM4's new 2048-bit interface.

04  ADVANCED PACKAGING (CoWoS-L)  ·  ~1 month  ·  TAIWAN (TSMC AP1–AP6)
Compute dies + HBM onto interposer. Two reticle-limited compute dies stitched (NV-HBI); 8–16 HBM stacks flanking them. The defining shortage of 2023–26. TSMC capacity: 15k → 80k wpm.

05  MODULE + SYSTEM ASSEMBLY  ·  ~4 weeks  ·  TAIWAN · MEXICO · US
Foxconn / Wistron / Quanta build trays. GPU packages + Grace CPUs on SXM modules; compute trays with cold plates, NICs, PCBs. Rack integrator (Supermicro, Celestica) assembles into NVL72 form factor.

06  DATACENTER INTEGRATION & BURN-IN  ·  ~2 weeks  ·  US · EU · ASIA
Installed in a GW-class building, powered up. Plugged into InfiniBand/Ethernet fabric, liquid-cooling loops, grid & on-site gas. Training run begins. Tokens out.

CUMULATIVE LEAD-TIME FROM WAFER START: ~6–9 months
Figure 07. End-to-end lead-time for a single GPU. Every week is a week that the datacenter above it cannot be filled. This is why forecasting error at any step propagates through hundreds of billions of dollars of capex.
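The cumulative clock can be reproduced from the per-stage estimates. The sketch below sums stages 02–06 two ways: strictly in series, and along the critical path if HBM assembly overlaps wafer fabrication. The overlap is an assumption of this example (the plate simply lists the stages in order), and `STAGES`, `finish_week`, and the stage keys are illustrative names:

```python
from functools import lru_cache

WEEKS_PER_MONTH = 52 / 12

# stage -> (duration in weeks, prerequisite stages); durations from the plate.
STAGES = {
    "wafer_fab": (3 * WEEKS_PER_MONTH, ()),
    "hbm_stack": (2 * WEEKS_PER_MONTH, ()),  # ASSUMPTION: runs alongside fab
    "packaging": (1 * WEEKS_PER_MONTH, ("wafer_fab", "hbm_stack")),
    "system":    (4.0, ("packaging",)),
    "datacenter": (2.0, ("system",)),
}


@lru_cache(maxsize=None)
def finish_week(stage: str) -> float:
    """Earliest finish: latest prerequisite finish plus this stage's duration."""
    duration, prereqs = STAGES[stage]
    start = max((finish_week(p) for p in prereqs), default=0.0)
    return start + duration


serial_weeks = sum(duration for duration, _ in STAGES.values())
critical_weeks = finish_week("datacenter")

print(f"serial:        {serial_weeks / WEEKS_PER_MONTH:.1f} months")
print(f"critical path: {critical_weeks / WEEKS_PER_MONTH:.1f} months")
```

The serial sum comes to about 7.4 months, inside the plate's ~6–9 month band; the overlapped path is about 5.4 months, so the quoted range evidently leaves room for queueing, shipping, and rework that these point estimates omit.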

Figure 08  /  Layer 10 — The Unit of Product

GB200 NVL72 Rack Architecture.

Post-Blackwell the rack is the product. Seventy-two GPUs stitched into one scale-up domain by a copper NVLink backplane, drawing ~120 kW and cooled by liquid. Rubin's Kyber rack will push this to 144 and then 576 GPUs — and go optical.

NVL72  ·  72× B200  ·  18 COMPUTE + 9 NVSWITCH TRAYS  ·  ~120 kW

Rack, top to bottom:
POWER SHELF: 6× PSU, 415V 3-phase busbar
CT01–CT09: compute trays, each 4× B200 · 2× Grace
NVSWITCH TRAYS · ×9: 5th-gen NVLink · 1.8 TB/s per GPU · 130 TB/s aggregate
COPPER BACKPLANE: ~1.5 mi of cable
CT10–CT18: compute trays, each 4× B200 · 2× Grace
CDU · LIQUID COOLING MANIFOLD (supply ↔ return)

SCALE-UP DOMAIN
72 GPUs, one memory space. All 72 B200s appear as one coherent domain. A tensor split across them talks at NVLink speed, not Ethernet speed. This is why the rack replaces the GPU as the unit of purchase.

THE NUMBERS
72× Blackwell B200 GPUs
36× Grace (Arm) CPUs
18× Compute trays (SXM)
9× NVSwitch trays
72× ConnectX-7 NICs
72× 800G OSFP optics

POWER & COOLING
~120 kW per rack, continuous
~410,000 BTU/hr of heat rejected (120 kW × ~3,412 BTU/hr per kW)
Direct-to-chip liquid; air won't cut it
Two-phase cooling on Rubin+

ECONOMICS
~$3M list price per rack
~1.5 miles of copper cable inside
~$10–12B annual revenue per GW

WHAT COMES NEXT
Vera Rubin: 144 GPUs, NVLink 6
Kyber / NVL576: optical scale-up
Projected rack draw: 250–600 kW
Figure 08. Schematic front view of a GB200 NVL72. Compute trays in navy, nine NVSwitch trays in the center forming the copper spine. Power on top, liquid cooling on bottom. Seventy-two GPUs talk as one.
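Most of the rack's headline numbers follow from a handful of inputs. A minimal sketch that derives them from the tray counts and per-GPU figures quoted in the plate; the only value not taken from the plate is the standard unit conversion of 1 kW ≈ 3,412 BTU/hr:

```python
# Inputs quoted in the plate.
COMPUTE_TRAYS = 18
GPUS_PER_TRAY = 4
CPUS_PER_TRAY = 2
NVLINK_TBPS_PER_GPU = 1.8     # 5th-gen NVLink
RACK_POWER_KW = 120

# Standard unit conversion (not from the plate).
BTU_PER_HR_PER_KW = 3412.14

gpus = COMPUTE_TRAYS * GPUS_PER_TRAY
cpus = COMPUTE_TRAYS * CPUS_PER_TRAY
aggregate_nvlink_tbps = gpus * NVLINK_TBPS_PER_GPU
heat_btu_per_hr = RACK_POWER_KW * BTU_PER_HR_PER_KW  # all power leaves as heat
kw_per_gpu_slot = RACK_POWER_KW / gpus               # whole-rack share, not chip TDP

print(gpus, cpus)                          # 72 36
print(round(aggregate_nvlink_tbps, 1))     # 129.6, quoted as ~130 TB/s
print(round(heat_btu_per_hr))              # about 409,000 BTU/hr
print(round(kw_per_gpu_slot, 2))           # about 1.67 kW per GPU slot
```

The per-slot figure includes CPUs, switches, NICs, and conversion losses, so it should not be read as the power of a single GPU package.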

Figure 09  /  Layer 9

Three Networks in an AI Cluster.

An AI cluster is not one network, it is three — at very different scales, speeds, and costs. The back-end alone accounts for ~85% of cluster networking spend.

THREE NETWORK DOMAINS  ·  INTRA-RACK · INTER-NODE · CONTROL PLANE  (ordered FAST → SLOW)

1. Scale-up  ·  INTRA-RACK · COPPER TODAY · OPTICAL TOMORROW
Drawing: one NVSwitch fanning out to eight GPUs.
Technologies: NVLink · UALink · ICI · NeuronLink. One coherent memory / scale-up domain.
Bandwidth: 1.8 TB/s per GPU (NVLink 5)

2. Scale-out (Back-End)  ·  INTER-RACK · INFINIBAND OR ETHERNET
Drawing: racks R1–R9 under three LEAF switches.
Technologies: InfiniBand XDR · 400 / 800G Ethernet. 1 NIC per GPU · optical pluggables / CPO.
Cost: ~85% of cluster network spend

3. Front-End (Control Plane)  ·  STORAGE · MANAGEMENT · EXTERNAL INGRESS
Drawing: EXTERNAL (WAN / INGRESS · STORAGE TIER · ORCHESTRATION) → AGGREGATION (TOR / LEAF, Ethernet 100 GbE) → HEAD NODES 01–04 (BF-3 DPU).
Technologies: Conventional Ethernet · DPUs. User ingress, storage IO, orchestration.
Bandwidth: 100 GbE typical per node
Figure 09. The three networks of a modern AI cluster, ordered top-to-bottom by bandwidth. Scale-up keeps GPUs close; scale-out stitches racks into a pod; the front-end is just cloud networking.

Figure 10  /  Layers 11 + 12

Datacenter Power & Cooling Stack.

From high-voltage utility feeder to transistor: roughly a dozen conversion stages, each shedding heat, each with its own supply chain. This is why "build a 1 GW datacenter" is a three-year project, not a three-month one.

ELECTRICAL STACK  ·  UTILITY → CHIP

Sources:
GRID: 345 kV+ utility feeder
ONSITE GAS: turbines · RICE · bypass interconnect
NUCLEAR / SMR: 2029+ realistic · Oklo · X-Energy

Conversion chain:
SUBSTATION: HV → MV, 34.5 kV
TRANSFORMER: MV → LV, 480 V AC
UPS: double-conv, Li-ion / flywheel
PDU / BUSWAY: 415 V 3-φ distribution
SERVER PSU: AC → 54 V DC, ~97% eff.
VRM / CHIP: 54 V → 0.7 V DC, 1000 A+

Conversion loss, left to right: ~0.5% · ~0.8% · ~4–7% · ~0.5% · ~3% · ~10%  (Σ ≈ 15–20%)

COOLING LOOP  ·  PARALLEL TO ELECTRICAL STACK  (← heat flow)
COOLING TOWER: evaporative, reject to air
CHILLER PLANT: centrifugal, 6–10 °C supply
CDU: coolant distribution
RACK MANIFOLD: supply ↔ return, quick-disconnect
COLD PLATE: direct-to-chip, ~85–95% capture
PACKAGE: 1200 W+ heat source
PUE target: 1.1–1.2 for a liquid-cooled AI site

SCALE REFERENCE  ·  RACK → CAMPUS
NVL72 RACK: 120 kW continuous IT load · ~1.5 mi copper inside
100K H100 CLUSTER: 150 MW critical IT power · Rubin of same scale: ~300–400 MW
FRONTIER CAMPUS: 1 GW ≈ one large nuclear reactor · xAI Colossus 2 · Stargate Abilene
REVENUE PER GW: $10–12B annual, at full utilization → every 6-mo delay = billions

Every kilowatt in must come out as heat. Every conversion loses a few percent. Every gigawatt is a political project.
Figure 10. Energy flow (top row, left-to-right) and heat flow (middle row, right-to-left). Roughly 15–20% of the energy drawn from the utility is lost to conversion before it reaches the transistors. Every kilowatt delivered to silicon must be pulled back out as heat by the parallel liquid loop.
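Stage losses compound multiplicatively rather than adding, which is easy to check. The sketch below chains the six loss figures from the plate; pairing each figure with a stage in left-to-right order (substation through VRM) is an assumption of this example, since the plate lists the losses as a bare row:

```python
from math import prod

# (stage, lowest loss, highest loss) as fractions of the power entering the stage.
# ASSUMPTION: the plate's six loss figures map onto the six conversion stages
# in left-to-right order.
STAGE_LOSSES = [
    ("substation",  0.005, 0.005),
    ("transformer", 0.008, 0.008),
    ("UPS",         0.04,  0.07),
    ("PDU/busway",  0.005, 0.005),
    ("server PSU",  0.03,  0.03),   # matches the quoted ~97% efficiency
    ("VRM",         0.10,  0.10),
]


def cumulative_loss(losses: list[float]) -> float:
    """Fraction of input power lost after passing through every stage in series."""
    delivered = prod(1.0 - loss for loss in losses)
    return 1.0 - delivered


best_case = cumulative_loss([low for _, low, _ in STAGE_LOSSES])
worst_case = cumulative_loss([high for _, _, high in STAGE_LOSSES])

print(f"end-to-end loss: {best_case:.1%} to {worst_case:.1%}")
```

Under that mapping the chain loses roughly 18% to 20% end to end, at the upper end of the plate's Σ ≈ 15–20%, with the final VRM step alone accounting for about half of it.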
US power demand was flat for 15 years. AI broke that. Utility interconnection queues now stretch 3–7 years — which is why every hyperscaler is now also, reluctantly, an energy company.
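The plate's "every 6-mo delay = billions" can be made concrete with the quoted revenue per gigawatt. A back-of-envelope sketch, assuming revenue accrues evenly over the year at full utilization (the plate's own condition); `forgone_revenue_busd` is a hypothetical helper name:

```python
def forgone_revenue_busd(gigawatts: float, delay_months: float,
                         annual_busd_per_gw: float) -> float:
    """Revenue (in $B) not earned while a site of the given size sits idle."""
    return gigawatts * annual_busd_per_gw * delay_months / 12


# 1 GW campus, six-month slip, at the quoted $10-12B per GW per year.
low = forgone_revenue_busd(1, 6, 10)
high = forgone_revenue_busd(1, 6, 12)
print(f"${low:.0f}B to ${high:.0f}B forgone")  # $5B to $6B forgone
```

This counts only revenue that is not earned during the slip; it ignores the carrying cost of hardware already purchased, which would push the figure higher.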