Like many of my fellow travellers, I empathised with the thousands of British Airways passengers whose travel plans were thrown into chaos by that airline’s recent global systems outage.
This was a stark reminder of the massive impact ICT failures can have.
It follows a spate of similar incidents involving computer systems at Air France, Jetstar, Lufthansa, Delta and Southwest airlines, not to mention last month’s passport system failure that caused lengthy delays at international airports in Australia and New Zealand. All of which raises questions about our reliance on technology and the quality of the systems on which we depend.
While none of these failures was life-threatening – since they involved applications for scheduling, passport processing, baggage allocation and other logistical functions – what if a similar failure were to occur in a safety-critical system?
Last month I read the terrifying story of Captain Kevin “Sully” Sullivan, who was piloting Qantas Flight 72 from Singapore to Perth in October 2008 when the flight system “went psycho” and nearly crashed the plane.
A former US Navy Top Gun fighter pilot, Sully only recently consented to be interviewed about the incident in which the primary flight control systems crashed after the plane experienced ten simultaneous failures.
The Airbus A330 dropped 150 feet in two seconds, the first of two nosedives that tossed cabin crew, passengers and luggage around the cabin and left scores of people seriously injured.
A three-year investigation by the Australian Transport Safety Bureau found that one of the plane’s air-data computers malfunctioned and sent incorrect data to the flight control systems, but the investigation could not determine what triggered the malfunction.
Fortunately, Sully completed an emergency landing and all 315 souls survived, although the physical and emotional cost for many has been great, and group action continues in the US against the aerospace companies involved.
Others have not been so lucky. Just eight months after the QF72 incident, 228 people died when an Air France A330 crashed into the Atlantic Ocean: incorrect speed data was fed to the flight computers, the autopilot disconnected, and the pilots, reacting to the false readings, inadvertently stalled the plane.
Of course, aviation systems are just one of many types of safety-critical systems, which automate key functions in sectors such as medicine, energy, mining, construction, infrastructure, defence, chemical processing and transport. Soon, could that list include your car, or even your taxi driver?
I was fortunate to attend the Australian System Safety Conference (ASSC2017) in Sydney earlier this month, where systems engineers and safety-critical systems experts discussed the need for greater integrity in testing and greater accountability in maintaining quality standards.
Organised by the Australian Safety Critical Systems Association (ASCSA), an ACS Special Interest Group (SIG), the conference heard a fascinating talk by Brisbane-based consultant Les Chambers on the risks – to life, property, reputation and even one’s soul – of poor integrity in software projects.
Sharing frankly from personal experiences where his own professional integrity buckled under pressure, Chambers challenged his audience to consider lessons learned from the Chernobyl nuclear accident and Bhopal gas tragedy.
“Engineers who do not study and reflect on the meaning of integrity and do not have their integrity regularly tested, even in small ways, will not act with integrity in a crisis, as we have seen at VW, Chernobyl, Bhopal and on and on,” he said.
“Indeed, as a high level corporate initiative, integrity must be looking good to VW right now. Their lack of integrity will cost them more than ten billion dollars in the years to come. Union Carbide has no cause to reflect as it no longer exists. The Bhopal disaster destroyed it along with thousands of innocent lives.”
Chambers said disasters involving failed technology could only occur if the professionals who knew what would happen held back from taking preventative action. He encouraged delegates to develop their ability to “speak truth to power” – to challenge incompetence, rationalisations and commercial expediency, and to instil integrity in everything they do.
Research suggests that human factors account for as much as 90 per cent of systems failures. The integrity of the systems on which we rely is therefore a direct reflection of the integrity of the people who build and maintain them.
The ACS is currently working on a certification process for Australian safety-critical systems professionals.
Created by a team of Australia’s leading safety-critical systems experts, the Certified Professional Safety Critical Systems credential will be the first in a series of specialisms developed by the ACS.
Please send comments or queries to committee@ascsa.org.au