An Australian tool has cut the amount of power generative AI (genAI) requires by more than two thirds, offering promise for a fast-growing industry that faces tough decisions about how to overcome power shortages that could constrain 40 per cent of AI data centres by 2027.
The ICT industry is putting growing pressure on global power grids, with Australia’s more than 200 data centres already consuming an estimated 5 per cent of the country’s electricity and AI-focused data centres expected to push this to 8 per cent of supply by the end of the decade.
The complex calculations of genAI decoders require ten or more times as much energy as conventional data queries – around 3 watt-hours (Wh) per genAI query, compared with 0.3Wh for a Google search – and increasingly complex genAI models are likely to push this higher.
Noting that the amount of computational power needed to sustain AI’s growth is doubling every 100 days, the World Economic Forum warns that training OpenAI’s GPT-3 genAI system used around 1,300 MWh of electricity, while its successor GPT-4 required 50 times as much.
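Taken together, those two figures imply a steep trajectory. A rough back-of-envelope check – using only the numbers quoted above, which are the WEF’s and not independently verified – shows the scale involved:

# Back-of-envelope check of the WEF figures quoted above (illustrative only;
# the inputs are the article's numbers, not independently verified).
GPT3_TRAINING_MWH = 1_300        # reported GPT-3 training energy
GPT4_MULTIPLIER = 50             # GPT-4 reportedly required ~50x as much
DOUBLING_PERIOD_DAYS = 100       # quoted doubling period for AI compute demand

gpt4_training_mwh = GPT3_TRAINING_MWH * GPT4_MULTIPLIER
print(f"Implied GPT-4 training energy: {gpt4_training_mwh:,} MWh "
      f"(~{gpt4_training_mwh / 1000:.0f} GWh)")

# A 100-day doubling compounds to roughly 12.6x growth per year.
annual_growth = 2 ** (365 / DOUBLING_PERIOD_DAYS)
print(f"Implied annual growth in compute demand: ~{annual_growth:.1f}x")

On those assumptions, GPT-4’s training alone would have consumed roughly 65 GWh, and a 100-day doubling implies compute demand growing more than twelvefold each year.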
This poses real challenges to power networks, particularly given new projections from consultancy TM Advisory that indicate Australian firms will invest $26 billion to more than double local data centre capacity by 2030.
This is expected to create 9,600 new jobs, with an estimated 45 per cent of data centre operators turning to renewable energy.
Yet with 80 per cent of data centre operators struggling to get enough critical equipment to meet surging AI demand, another consultancy, Turner and Townsend, warns in its latest annual Data Centre Cost Index that genAI must also become more efficient.
New solutions for new problems
Even as firms like chip maker Nvidia find new ways to make their powerful, power-hungry graphics processing units (GPUs) even more so, scientists have explored whether power-efficient technologies like neuromorphic computing might offer cooler, more efficient alternatives.
Canberra-based AI specialist Trellis Data took a different approach, tackling the problem in software – its new Dynamic Depth Decoding (D3) technology uses speculative decoding to boost average genAI speed by 44 per cent compared with the popular EAGLE-2 method.
This speed increase – which the company says has allowed it to develop the world’s fastest genAI decoder – translates to what CEO Michael Gately said is a 68.4 per cent reduction in the amount of power required to run what he called “very [computationally] greedy” AI models.
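Trellis Data has not published D3’s internals, but the speculative decoding family of techniques it builds on follows a common pattern: a small, cheap “draft” model proposes several tokens ahead, and the large “target” model verifies them all at once, keeping the longest prefix it agrees with – so the expensive model runs far less often per generated token. The sketch below is a toy, greedy illustration of that general idea only; the draft_model and target_model functions are invented stand-ins, not D3 or EAGLE-2 code:

# Minimal sketch of greedy speculative decoding (illustrative toy only;
# not Trellis Data's D3 implementation, whose internals are not public).

def draft_model(context):
    """Cheap stand-in for a small draft model: picks a next token from the context."""
    return (sum(context) * 31 + 7) % 100

def target_model(context):
    """Stand-in for the large target model (mostly agrees with the draft)."""
    token = draft_model(context)
    return token if token % 5 else (token + 1) % 100   # disagrees ~20% of the time

def speculative_decode(prompt, num_tokens, draft_len=4):
    out = list(prompt)
    target_calls = 0
    while len(out) - len(prompt) < num_tokens:
        # 1. Draft model cheaply proposes `draft_len` tokens ahead.
        draft, ctx = [], list(out)
        for _ in range(draft_len):
            tok = draft_model(ctx)
            draft.append(tok)
            ctx.append(tok)
        # 2. Target model verifies the whole draft (one batched pass in practice).
        target_calls += 1
        accepted, ctx = 0, list(out)
        for tok in draft:
            if target_model(ctx) == tok:
                accepted += 1
                ctx.append(tok)
            else:
                break
        # 3. Keep the accepted prefix, then take one token from the target model,
        #    so at least one token is produced per expensive call.
        out.extend(draft[:accepted])
        out.append(target_model(out))
    return out[len(prompt):], target_calls

tokens, calls = speculative_decode([1, 2, 3], num_tokens=40)
print(f"Generated {len(tokens)} tokens with only {calls} target-model passes")

In real systems the verification step is a single batched forward pass of the large model, which is where the speed – and therefore power – savings come from; the 44 per cent and 68.4 per cent figures above are the company’s own claims for D3.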
“The right model for the right purpose delivers really good value,” he told Information Age, “and what better way to alleviate some of the consequential impacts of AI on the environment than by making them faster?”
“It’s about having some level of responsibility for the impact of delivering outcomes to our customers.”
Trellis Data’s customers – many of whom are confidential government agencies – “need stuff that’s deployable and fully disconnected from the Internet,” Gately said, “and still able to deliver the benefits that AI has promised on a global scale.”
“The ability to have servers that are smaller or less in number to be able to deliver those outcomes is really important, and it just means that AI can work in new places.”
Slowing down the flood
Whatever improvements can be achieved in software, however, the genAI industry must still come to grips with the fact that its breakneck pace of development has pushed it well past the time-tested realm of Moore’s Law – the observation that computing power doubles roughly every 18 months.
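The scale of that mismatch is easy to see from the figures already quoted in this article: an 18-month doubling in hardware capability simply cannot keep pace with a 100-day doubling in AI compute demand. A quick, illustrative comparison:

# Comparing the two doubling rates quoted in this article (illustrative only).
MOORES_LAW_DOUBLING_DAYS = 365 * 1.5   # computing power doubling every ~18 months
AI_DEMAND_DOUBLING_DAYS = 100          # WEF's quoted doubling for AI compute demand

years = 3
moore_growth = 2 ** (365 * years / MOORES_LAW_DOUBLING_DAYS)
ai_growth = 2 ** (365 * years / AI_DEMAND_DOUBLING_DAYS)

print(f"Over {years} years: hardware on a Moore's Law curve improves ~{moore_growth:.0f}x,")
print(f"while AI compute demand grows ~{ai_growth:.0f}x – a gap of ~{ai_growth / moore_growth:.0f}x")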
As long as chipmakers kept ahead of this curve by developing ever more powerful processors, conventional wisdom held, technological progress was sustainable even as cloud-based systems and data centres consolidated massive amounts of computing power.
Cryptocurrency tested that model by driving a global surge in power consumption that outstripped entire countries – but even that, Gartner vice president and distinguished analyst Jorge Lopez told Information Age, has been outpaced by genAI, whose power needs “are really changing the whole discussion.”
“No one expected how much these supercomputing data centre facilities would actually consume in power,” he said.
“People were afraid of how energy intensive crypto mining was – but AI is at a whole different level.”
“We used to be able to rely on Moore’s Law to provide the price-performance to stay ahead of the complexities of software, but AI is forcing semiconductors to consider that what we thought about Moore’s Law needs to be rethought.”
Surging power consumption is likely to challenge existing energy grids that have used largely the same technologies for decades, he said, amidst Gartner predictions that data centres’ power consumption will reach 500 terawatt-hours (TWh) annually by 2027 – 2.6 times their 2023 level.
Despite best efforts to stay ahead of AI’s power demands by upgrading infrastructure and optimising the technology itself, however, Gartner has warned that 40 per cent of AI data centres will hit the limits of their capabilities by 2027 because they can’t get enough power.
Microsoft’s recent 20-year deal to revive the Three Mile Island nuclear reactor – the site of America’s worst ever nuclear disaster in 1979 – hints at growing acceptance that nuclear power may be the only way to meet AI’s long-term energy demands.
“To get to where we think we’re going, we really need to make some major considerations on the infrastructure,” Lopez said.
“It’s important that we at least have a belief that innovations will begin to cut into the consumption part – but this could be years away.”
“We don’t have the answers to this yet.”