Cooling, Power, Scale: What it really takes to support AI in the Data Centre
As more businesses roll out artificial intelligence (AI) into live operations, the focus is beginning to shift away from model accuracy and towards infrastructure viability. In many cases, what's slowing things down isn't strategy or ambition - it's thermal limits, grid capacity, and legacy infrastructure that was never built with AI in mind.
These aren't edge-case challenges affecting a few hyperscalers. They're affecting colocation sites, enterprise facilities, and regional data centres across Europe. Teams are having to re-evaluate how power is delivered, how heat is managed, and how much density a rack can realistically support before resilience starts to break down.
Power demand is climbing faster than capacity
Supporting AI workloads at scale means significantly increasing the amount of electrical power available per rack - and doing so without compromising uptime. The shift from central processing units (CPUs) to graphics processing units (GPUs), tensor processing units (TPUs) and other accelerators is pushing rack-level power consumption far beyond historical averages.
It's not unusual now to see racks drawing 30kW or more, with many operators preparing for 80kW and upwards in AI-specific zones; forecasts suggest this could rise to 300-600kW, and possibly 1MW per rack, by 2030. At these levels, everything upstream of the rack has to be revisited: switchgear, power distribution units (PDUs), uninterruptible power supplies (UPSs), cooling infrastructure, and even room layout and cable runs.
But raw power delivery isn't the whole story. AI training workloads can draw this power in short, sharp bursts. Inference loads, meanwhile, may run continuously for hours at a time. That variability makes power planning more complex, especially when legacy infrastructure is involved.
For data centre operators, the priority is no longer just to deliver more power - it's to deliver it with precision and resilience, in systems that can tolerate sudden changes without tripping or overloading.
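To make the arithmetic concrete, here is a minimal sketch of why power, rather than floor space, becomes the constraint. The grid connection, overhead factors and rack densities below are illustrative assumptions, not figures from any specific site.

```python
# Illustrative sketch: how far does a fixed grid connection stretch
# once AI racks, burst headroom and cooling overhead are factored in?
# All figures are assumptions for the sake of the example.

GRID_CONNECTION_MW = 10.0      # assumed site supply
COOLING_OVERHEAD = 0.30        # assumed overhead on top of IT load (cooling, losses)
BURST_HEADROOM = 0.20          # assumed margin for short, sharp training bursts

def racks_supported(rack_kw: float) -> int:
    """Racks a site can host at a given per-rack draw, after overheads."""
    usable_it_mw = GRID_CONNECTION_MW / (1 + COOLING_OVERHEAD)
    usable_it_mw /= (1 + BURST_HEADROOM)
    return int(usable_it_mw * 1000 // rack_kw)

for rack_kw in (10, 30, 80, 300):
    print(f"{rack_kw:>4} kW racks -> ~{racks_supported(rack_kw)} racks")

# A 10 MW connection that once hosted hundreds of 10 kW racks supports
# only a few dozen at 300 kW - power, not floor space, sets the pace.
```

On these assumed numbers, the same connection drops from roughly 640 low-density racks to around 20 at 300kW, which is the scale of rethink the upstream electrical plant has to absorb.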
Cooling infrastructure is having to adapt quickly
With higher power draw comes higher heat output. Traditional air-cooling systems - based on hot aisle / cold aisle containment and raised floor distribution - are being stretched to their limits in high-density environments.
Facilities are now starting to incorporate liquid cooling alongside existing air systems, particularly in AI training halls or retrofitted areas. Direct-to-chip cooling, where fluid is piped directly to the heat-generating components, is currently leading adoption. It offers improved efficiency and allows for denser hardware installation without a proportional increase in footprint.
Immersion cooling is also being trialled in edge scenarios and research settings, though it brings operational complexity that many commercial sites are still working through.
Operators are learning that cooling for AI is no longer about broad, room-level thermal management. It's now a question of pinpoint accuracy - keeping small zones thermally stable under volatile load, while optimising overall energy use to stay within carbon footprint targets.
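A rough sense of why direct-to-chip loops need careful engineering comes from the basic heat-transfer relation Q = ṁ·cp·ΔT. The sketch below works through it with assumed values (an 80kW rack, a water-based coolant and a 10°C temperature rise); it is an illustration, not a vendor specification.

```python
# Back-of-envelope coolant flow for a direct-to-chip loop.
# Q = m_dot * c_p * delta_T  ->  m_dot = Q / (c_p * delta_T)
# Figures below are assumptions for illustration, not vendor specs.

RACK_HEAT_KW = 80.0        # assumed heat removed by the liquid loop
CP_WATER = 4.186           # kJ/(kg*K), specific heat of water
DELTA_T = 10.0             # K, assumed coolant temperature rise across the rack
WATER_DENSITY = 1.0        # kg/L (approximate)

mass_flow_kg_s = RACK_HEAT_KW / (CP_WATER * DELTA_T)           # kg/s
volume_flow_l_min = mass_flow_kg_s / WATER_DENSITY * 60        # L/min

print(f"Required flow: {mass_flow_kg_s:.2f} kg/s (~{volume_flow_l_min:.0f} L/min)")
# Roughly 1.9 kg/s, or about 115 L/min, for a single 80 kW rack - small
# changes in delta_T or heat load feed straight into pump and pipework sizing.
```

Scaled across a full AI zone, those flow rates explain why pipework, pumps and leak detection become commissioning items in their own right.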
Scaling up is about more than space
When businesses talk about scaling AI infrastructure, they often mean adding more racks, more servers, more capacity. But physical space is rarely the limiting factor. Power availability, cooling headroom, and grid connectivity are much more likely to dictate the pace of expansion.
In many UK regions, new grid connections can take years to secure. This is delaying new builds, but it's also affecting how quickly existing sites can scale. Even in large facilities, operators are phasing deployments based on available power rather than floor plan capacity.
Some are mitigating this by deploying modular power systems, on-site generation, or battery energy storage systems (BESS). Others are zoning their facilities so AI-specific infrastructure is concentrated in areas that have been reinforced for high density and thermal load.
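One way to reason about the battery option is a simple ride-through calculation: how much usable storage is needed to bridge the gap between peak demand and a constrained grid connection. The sketch below uses invented figures for the shortfall, duration, efficiency and depth of discharge, purely to show the shape of the calculation.

```python
# Illustrative BESS sizing: bridging the gap between peak AI demand
# and a constrained grid connection. All inputs are assumptions.

PEAK_SITE_DEMAND_MW = 12.0     # assumed peak during simultaneous training bursts
GRID_CONNECTION_MW = 10.0      # assumed firm grid capacity
BURST_DURATION_H = 0.5         # assumed length of the peak period
ROUND_TRIP_EFFICIENCY = 0.90   # assumed round-trip efficiency
DEPTH_OF_DISCHARGE = 0.80      # assumed usable fraction of nameplate capacity

shortfall_mw = PEAK_SITE_DEMAND_MW - GRID_CONNECTION_MW
energy_needed_mwh = shortfall_mw * BURST_DURATION_H
nameplate_mwh = energy_needed_mwh / (ROUND_TRIP_EFFICIENCY * DEPTH_OF_DISCHARGE)

print(f"Shortfall: {shortfall_mw:.1f} MW for {BURST_DURATION_H} h")
print(f"Nameplate BESS capacity required: ~{nameplate_mwh:.2f} MWh")
# ~1.4 MWh of nameplate storage to cover a 2 MW, 30-minute excursion -
# enough to phase deployments ahead of a grid upgrade, not to replace one.
```

The point of the exercise is the conclusion in the final comment: storage buys time and smooths peaks, but it doesn't substitute for a larger connection.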
The lesson here is clear: scale isn't about how much hardware you can fit into a room. It's about how effectively your infrastructure can support the performance envelope of next-generation workloads.
Commissioning is no longer routine
With these changes, commissioning becomes much more than a sign-off stage. It's a critical opportunity to understand how systems behave under real conditions, especially when dealing with AI load profiles that are unlike anything in traditional enterprise IT.
Operators are now simulating burst loads, validating failover across cooling and power systems, and assessing interoperability between legacy and new infrastructure. Digital twin models and thermal simulations are also being used to predict hotspots and airflow behaviour before equipment goes in.
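A very simplified version of that kind of burst-load check might look like the sketch below: it steps a synthetic training load profile against assumed UPS and cooling limits and flags where headroom is breached. The profile, limits and time step are all invented for illustration.

```python
# Minimal sketch of a commissioning-style burst-load check:
# step a synthetic AI load profile against assumed power and cooling
# limits and flag any breach. Values are illustrative only.

UPS_LIMIT_KW = 900.0        # assumed zone-level UPS capacity
COOLING_LIMIT_KW = 850.0    # assumed zone-level heat-rejection capacity

# Synthetic profile (kW per 5-minute interval): idle -> training burst -> inference
load_profile_kw = [200, 250, 820, 880, 910, 840, 600, 600, 600]

def check_headroom(profile: list[float]) -> list[str]:
    findings = []
    for step, load in enumerate(profile):
        if load > UPS_LIMIT_KW:
            findings.append(f"t+{step * 5:03d} min: {load} kW exceeds UPS limit")
        elif load > COOLING_LIMIT_KW:
            findings.append(f"t+{step * 5:03d} min: {load} kW exceeds cooling capacity")
    return findings

for finding in check_headroom(load_profile_kw):
    print(finding)
# Flags the 880 kW and 910 kW intervals - exactly the kind of transient
# a spec-sheet sign-off would miss but a burst simulation catches.
```

Real commissioning tools are far richer than this, but the principle is the same: exercise the system with realistic load shapes before the workloads arrive.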
Commissioning teams are increasingly multidisciplinary - bringing together mechanical, electrical and IT specialists to assess end-to-end performance. The most successful teams aren't just validating specs; they're uncovering weaknesses before workloads ever go live.
Preparing for what's next
AI isn't a passing trend, and the infrastructure being built now will need to support not just current-generation workloads, but whatever comes next. That's prompting a shift away from fixed-capacity design towards modular, scalable systems that can evolve.
Some operators are designing dedicated AI zones within existing data halls. Others are standing up new facilities optimised for high density from day one. In both cases, success depends on close coordination between cooling, power and compute - ideally with visibility across all three.