Google back online after global outage

Google services are back online this morning after a global outage hit popular services like YouTube and Gmail on Monday night.

The nearly hour-long outage began at around 10:45pm AEDT on Monday and lasted until just after 11:30pm.

YouTube, Gmail, Google Docs, and parts of the Google Cloud Platform were unavailable to users during the disruption, but Search was largely unaffected.

In a statement on the Google Cloud status dashboard, the company said the technical difficulties were caused by an account authentication error.

“The root cause was an issue in our automated quota management system which reduced capacity for Google's central identity management system, causing it to return errors globally,” Google said.

“As a result, we couldn’t verify that user requests were authenticated and served errors to our users.”

Users were initially kept in the dark – some quite literally with the outage stretching to smart home devices like lights – as the various social media accounts for Google’s array of services tried to deal with a sudden influx of complaints.

That doesn't sound good, Tejveer. Currently, there are no disruptions with Gmail. Could you try connecting to a different network to see if that works? Keep us posted.
— Gmail (@gmail) December 14, 2020

Internally, Google was suffering from “similar errors” with its own tools, which the company said caused problems when it tried to communicate about the global error.

YouTube was the first service to get back up and running – no doubt a high priority given it is a major source of advertising revenue – but the rest of Google’s productivity suite soon followed, though minor issues continued to be triaged.

Google has not attributed the outage to a bad actor and said it would release a full report of the incident following its own investigations.

Update -- We’re back up and running! You should be able to access YouTube again and enjoy videos as normal https://t.co/NsGBvvaTko
— TeamYouTube (@TeamYouTube) December 14, 2020

This was the second major cloud disruption to happen in the past month, pointing to areas of fragility in cloud-based infrastructure.

A North American region of Amazon Web Services (AWS) went down in November, causing numerous websites and platforms reliant on AWS to go offline for hours.

In a post-mortem of the incident, AWS explained that a “relatively small addition of capacity” to some front-end servers caused thousands of servers to overload their processors, which busted their caches and stopped the flow of data.

During the multi-hour outage, users of internet-of-things devices like iRobot’s Roomba vacuum cleaner and the Amazon Ring doorbell discovered their devices were no longer fully functional.