Funding doubt threatens Australia's compute clout

Operators of Australia's top high-performance computing facilities have called for funding certainty to enable them to attract staff and keep expanding in size and scale.

A panel convened at the Australian Data Centre Strategy Summit saw top executives of the Pawsey Supercomputing Centre, National Computational Infrastructure (NCI) and the Bureau of Meteorology lament the lack of "longevity" in funding arrangements for their facilities.

NCI's associate director of services and technology Allan Williams said the facility's future remained uncertain, despite a recent "refunding exercise".

Williams said NCI's collaboration partners - a mix of academic institutions and commercial firms - covered the "operational" costs of the NCI supercomputer, such as staffing and electricity.

But other costs, such as capital to grow, are covered by the Australian Government's National Collaborative Research Infrastructure Strategy (NCRIS), which has only just won a 12-month stay of execution. The program had faced defunding until the Government backed down under pressure from the scientific community.

"Obviously it's very hard to keep staff engaged if they don't think they've got a job [mid- to long-term], but that's the financial reality," Williams said.

He said NCI's major partners had signed on to fund operational costs for another two years, but that had again simply "delayed" a harder decision about NCI's long-term future.

Pawsey executive director Neil Stringfellow also lamented the short-term thinking on funding arrangements, noting it contributed to a skills shortage in the sector.

"[If funding is] from a particular source where you've got to spend it in the next year or 18 months, you can't attract the sort of people you need to attract into these positions even if you could find them," he said.

"You can't attract people if you've only got 'spend this in 12 months or it's gone'."

Williams urged a rethink in the way high-performance computing (HPC) and supercomputing facilities are viewed.

"Instead of seeing HPC as a cost to the nation, use it as an opportunity to generate new industries and kickstart that innovation," he said, noting the facilities were currently underpinning research in areas as diverse as cancer cures and fluid dynamics, but that their role was largely masked and "behind-the-scenes".

"The economic benefits behind HPC are huge, and getting that message across and getting buy-in from the government and public to support this infrastructure is very important."

Staffing up

Without exception, all three operators spoke of plans to dramatically scale up their facilities in the next few years to meet advanced research and compute demands.

The Bureau of Meteorology's chief technology officer Barry Nugent said he expected weather models to be about 150 times their current size within four years, and that compute power would need to grow exponentially to match.

"The load we're seeing is between 40 to 80 kilowatts per rack," Nugent said.

NCI's Williams similarly said his 50 or so racks of gear currently run at "about 35-40 kilowatts a rack", and that he expected a next-generation system to run at "60-80 kilowatts a rack".
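To put those per-rack figures in context, the short sketch below multiplies them out to a facility-level load. It is illustrative arithmetic only: the rack count of 50 is taken from Williams' "50 or so", and holding it constant for the next-generation system is an assumption made here, not something he stated. The function name is hypothetical.

```python
def total_power_mw(racks: int, kw_per_rack: float) -> float:
    """Return the aggregate IT load in megawatts for a number of racks."""
    return racks * kw_per_rack / 1000.0


# "50 or so racks" per the article; assumed unchanged for the next system.
RACKS = 50

# Low and high ends of the quoted per-rack ranges, in kilowatts.
current = (total_power_mw(RACKS, 35), total_power_mw(RACKS, 40))
next_gen = (total_power_mw(RACKS, 60), total_power_mw(RACKS, 80))

print(f"Current load:  {current[0]:.2f}-{current[1]:.2f} MW")
print(f"Next-gen load: {next_gen[0]:.2f}-{next_gen[1]:.2f} MW")
```

Under that assumption, the load would rise from roughly 1.75-2 megawatts to 3-4 megawatts. A real system could use more or fewer racks, so the figures are an order-of-magnitude guide at best.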

"We're moving into larger and larger scale systems," Williams said.

"The next machine we're talking about in 2-3 years is 10 petaFLOPS, but globally they're [already] talking about exaFLOPS which is a thousand times bigger.

"Finding staff that actually have the capabilities to support petaFLOPS [scale] systems ... is very rare so skills are always problematic. Getting those skills from somebody who wants to work at a university wage is even harder."

The number of moving parts in these huge systems means having skilled staff on hand, for example to replace failed components, is increasingly critical. Staff are also involved in optimising code to run on larger systems.

"Some of our staff are dedicated to running the machine but the rest are actually dedicated to optimising code - improving scaling so code can run faster and over more of the machine," Williams said.