Research
Datacenter energy as the binding constraint on computing
Why power availability — not chips or capital — is becoming the limiting factor for AI-era infrastructure, and what it means for architecture decisions today.
Executive summary
Datacenter energy has shifted from an operating cost line to the binding constraint on computing growth: grid interconnection queues now run years, AI training and inference concentrate unprecedented power density into single campuses, and siting decisions are increasingly made by megawatt availability rather than fiber or tax policy. This research program tracks the constraint from an infrastructure architect's seat — what it does to capacity planning, workload placement, and the long bet on new generation, including the fusion timeline I have written about publicly.
Why I am tracking this
For most of my career, power was the boring row of the capacity plan. You signed a colo contract or a cloud commit, and electricity was somebody else’s solved problem. That assumption is now visibly breaking: on accounts I work with and in the operator community around Nubinity, conversations that used to be about rack counts are now about megawatt counts, and the answer to “when can we have it” increasingly comes from the utility, not the vendor.
I also have a long-standing personal interest in the generation side — my fusion energy essay argued we are optimizing the wrong constraints in that program. This research thread is where the two interests meet: what does an energy-bound computing industry mean for the people who architect and operate the systems?
The shape of the constraint
Three observations anchor the program, each checkable against public data:
- Demand is compounding. The IEA projects datacenter electricity consumption roughly doubling between 2022 and 2026, with AI the largest growth driver. Training clusters concentrate load in a way traditional datacenters never did — single campuses requesting hundreds of megawatts.
- Supply is queue-bound. LBNL’s interconnection data shows thousands of gigawatts of generation waiting years in US queues. Even where generation exists, transmission approvals move on decade timescales. Money does not compress these queues much; process reform might.
- Siting has inverted. Fiber and latency used to pick datacenter locations. Increasingly, available firm power picks them, and network architecture has to follow — which changes WAN design, data gravity, and cloud-versus-on-prem math in ways most capacity plans haven’t caught up with.
There is a fourth observation hiding inside the first: density. A traditional enterprise rack draws single-digit kilowatts; AI training racks are being specified at an order of magnitude more, with liquid cooling as a requirement rather than an exotic option. That does not just strain the grid — it strands existing facilities. A datacenter with megawatts of contracted power can still be the wrong datacenter if its floor cannot carry the per-rack load, and retrofit economics are unforgiving.
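The stranding argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the `Facility` fields, the `deployable_racks` helper, and every number in the example are hypothetical assumptions, not measurements from any real site. It shows how a facility with ample contracted power can still host zero high-density racks when the per-rack limit is the binding term.

```python
from dataclasses import dataclass


@dataclass
class Facility:
    contracted_kw: float     # total facility power under contract (assumed)
    pue: float               # facility power / IT power (assumed)
    rack_positions: int      # physical rack positions on the floor (assumed)
    max_kw_per_rack: float   # per-rack limit from cooling and distribution (assumed)


def deployable_racks(f: Facility, rack_kw: float) -> int:
    """Number of racks of a given draw the facility can host without retrofit."""
    if rack_kw > f.max_kw_per_rack:
        return 0  # the floor cannot carry this density at all
    it_budget_kw = f.contracted_kw / f.pue  # power left for IT after overhead
    return min(f.rack_positions, int(it_budget_kw // rack_kw))


# Hypothetical air-cooled enterprise hall: 10 MW contracted, 10 kW/rack limit.
legacy = Facility(contracted_kw=10_000, pue=1.5, rack_positions=800, max_kw_per_rack=10)
# Hypothetical liquid-cooled hall with the same contract but a 100 kW/rack limit.
dense = Facility(contracted_kw=10_000, pue=1.5, rack_positions=200, max_kw_per_rack=100)

print(deployable_racks(legacy, 8))    # enterprise racks in the legacy hall
print(deployable_racks(legacy, 80))   # AI racks in the legacy hall
print(deployable_racks(dense, 80))    # AI racks in the liquid-cooled hall
```

With these assumed inputs the legacy hall fills all 800 positions with 8 kW racks but cannot take a single 80 kW rack, while the dense hall is limited by its power budget rather than its floor.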
Method
This is practitioner research, not a literature review. The program:
- Track public interconnection-queue and utility-filing data for the major US datacenter markets, quarterly.
- Maintain a running model of what energy-bound growth implies for the price and availability of the compute tiers I actually buy — colo, dedicated, and cloud reservations.
- Interview operators (hosting peers, colo providers, one utility-side contact so far) about what large-load customers are being told off the record about timelines.
- Test energy-aware scheduling ideas in my own lab and Nubinity’s environment at small scale — where “small” makes the physics honest: even a rack teaches you about power factor, cooling overhead, and work-per-watt measurement.
Early findings
Work-per-watt is not yet an engineering KPI, and it will have to be. Almost no team I’ve asked can state the energy cost of their workload’s unit of work (per request, per training step, per report). The instrumentation exists: power data at the PDU and increasingly per-server via BMC, workload data in the observability stack. What is missing is anyone joining the two datasets on a shared time axis. That join is cheap, and I think it becomes standard within a few years, the way cost-per-request became standard after cloud billing made waste visible.
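A minimal sketch of that join, under stated assumptions: both datasets have already been bucketed to the same fixed interval and keyed by bucket start time, and all measured power is attributed to requests (idle draw included). The function name `joules_per_request` and the sample values are hypothetical; real PDU or BMC exports will need their own parsing and alignment.

```python
def joules_per_request(power_samples, request_counts, interval_s):
    """Energy per request over the time buckets present in both datasets.

    power_samples  : {bucket_start_s: average watts in that bucket}
    request_counts : {bucket_start_s: requests served in that bucket}
    interval_s     : bucket length in seconds (assumed identical for both)
    """
    energy_j = 0.0
    requests = 0
    for bucket, watts in power_samples.items():
        if bucket not in request_counts:
            continue  # skip buckets that only one dataset covers
        energy_j += watts * interval_s   # watts * seconds = joules
        requests += request_counts[bucket]
    if requests == 0:
        raise ValueError("no requests in the overlapping buckets")
    return energy_j / requests


# Hypothetical one-minute buckets for a single server.
power = {0: 400.0, 60: 500.0, 120: 450.0}
served = {0: 1200, 60: 1800, 180: 900}

print(joules_per_request(power, served, interval_s=60))
```

Only the buckets at 0 s and 60 s overlap here, so the figure is computed from those two minutes. Attributing idle power to requests is a deliberate simplification; a team that wants marginal energy per request would subtract a measured idle baseline first.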
Schedulability is the underpriced architectural property. The grid increasingly pays (or discounts) for flexible load. Workloads architected so that batch and training components can shift in time and place can chase cheap energy; monoliths cannot. This is the same portability argument that multi-region resilience makes, arriving from an unexpected direction — the designs converge.
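To illustrate the time-shifting half of that argument, here is a small sketch that picks the cheapest contiguous window for a batch job from an hourly price forecast. It assumes the job runs uninterrupted for a whole number of hours and that a price series is available; `cheapest_window` and the price values are hypothetical, and a real scheduler would also weigh deadlines, data locality, and carbon or demand-response signals.

```python
def cheapest_window(hourly_prices, duration_h):
    """Return (start_hour, total_price) of the cheapest contiguous window."""
    if duration_h <= 0 or duration_h > len(hourly_prices):
        raise ValueError("duration must fit inside the price horizon")
    # Sliding-window sum: add the hour entering, subtract the hour leaving.
    window = sum(hourly_prices[:duration_h])
    best_start, best_cost = 0, window
    for start in range(1, len(hourly_prices) - duration_h + 1):
        window += hourly_prices[start + duration_h - 1] - hourly_prices[start - 1]
        if window < best_cost:
            best_start, best_cost = start, window
    return best_start, best_cost


# Hypothetical hourly prices (currency units per MWh) over an eight-hour horizon.
prices = [90, 80, 40, 30, 35, 70, 110, 120]

print(cheapest_window(prices, duration_h=3))
```

The same function works across places as well as times if it is run once per candidate region and the results compared, which is where the portability argument and the scheduling argument meet.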
The generation bets are barbelled. Hyperscaler behavior is the tell: contracting fission restarts and SMRs (firm, 2030s) while placing venture bets on fusion (upside, uncommitted timelines). My read of fusion hasn’t changed since I wrote about it: the physics milestones are real, the engineering-to-grid path is longer than the press releases imply, and no infrastructure plan should have it on the critical path.
What this changes in practice
The findings above are observations. These are the planning changes I am actually making, and recommending, as a result:
- Power questions move to the front of procurement. When evaluating a colo or a region, the first questions are now about committed versus merely contracted power, utility headroom, and what happens to my expansion rights when the facility’s allocation runs out. Ten years of hosting operations taught me that the constraint you discover at renewal time is the one you pay list price for.
- The planning horizon splits. Hardware lead times are quarters; power lead times in constrained markets are years. Capacity plans that treat them as one timeline will be wrong in the expensive direction. The utility conversation now has to start before the architecture is final, which inverts the order most organizations are used to.
- Schedulable load gets designed in, not bolted on. Separating latency-critical serving from batch and training work is no longer just a reliability nicety — it is the property that lets you place flexible load where and when energy is available. I treat it as an architectural requirement in new designs, at any scale.
- Energy joins the risk register. Not as a sustainability line, but as an availability and cost risk with an owner and a review cadence, the same as any other single-supplier dependency.
Open questions
- Does inference demand flatten (efficiency gains, smaller models), or does Jevons’ paradox hold, with total consumption compounding regardless?
- At what price differential does workload portability across regions actually get exercised, given data-gravity and sovereignty friction?
- Do energy constraints re-localize computing — a renaissance for well-sited on-prem and regional colo — or further concentrate it with the few players who can buy generation outright?
The larger point is one that infrastructure keeps re-teaching: the binding constraint is rarely the layer everyone is optimizing. The industry is optimizing chips and model architectures while the schedule is being set by interconnection queues and transformer lead times. Computing has always ultimately been a physical discipline (energy in, heat out), and the AI era is simply making the physics visible again.
If you operate infrastructure and are seeing the constraint from another angle, especially utility-side or colo-side, I am glad to compare notes via the contact page. This write-up is revised as the data moves.
Frequently asked questions
- Why is power now the limiting factor for datacenters?
- Demand from AI workloads is growing faster than grids can add generation and transmission. Utility interconnection requests for large loads face multi-year queues in major markets, so even a fully funded, fully permitted datacenter can sit waiting for megawatts. Compute has become easier to procure than the electricity to run it.
- What can infrastructure architects actually do about energy constraints?
- Treat power as a first-class architectural input: design workloads to be portable between regions with different energy availability, separate latency-critical serving from schedulable batch/training work, and measure work-per-watt as an engineering KPI. Teams that can move flexible load to where and when energy is cheap hold a structural cost advantage.
- Is nuclear or fusion realistically part of the datacenter energy answer?
- Fission — including planned SMR deployments and restarts contracted by hyperscalers — is a credible 2030s contributor. Fusion remains a long-horizon bet: enormously promising, repeatedly underestimated in difficulty, and not something any capacity plan this decade should depend on. Plan on efficiency, storage, and firm fission; treat fusion as upside.