A massive outage at Amazon Web Services (AWS) on Monday caused widespread disruptions to numerous popular websites, apps, and online services used by millions of people around the world. Although service was fully restored by the end of the day, details about exactly why AWS failed, knocking everything from school service provider Canvas to popular video game Fortnite offline, remain sparse.
On Monday, a major outage at Amazon Web Services (AWS), a cloud platform that supports a substantial portion of the internet, resulted in the temporary shutdown of many apps, websites, and online tools relied upon by millions of users globally. The hours-long breakdown of the cloud system revealed the extent to which modern life depends on this infrastructure, affecting everything from banking apps and airlines to smart home devices and gaming platforms. Service was not fully restored to all AWS customers until 3:01 p.m. ET, according to Amazon’s status updates.
The root cause of the outage was traced back to an error in a technical update to the API of DynamoDB, a key cloud database service that stores user information and other critical data for numerous online platforms. The update issue affected the Domain Name System (DNS), which helps apps locate the correct server addresses. As a result, apps were unable to find the IP address for DynamoDB’s API and could not establish a connection.
DNS is one of the foundational technologies of the internet. It is often likened to a “phone book” for internet servers, translating the addresses we type, like amazon.com, into the IP addresses used to connect us to the proper server. If DNS is the phone book, then the glitch in AWS’s DynamoDB would be a phone book that lists the wrong number for every person and business, resulting in a wave of failed connections.
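To make the phone book comparison concrete, here is a minimal Python sketch of the lookup step an app performs before it can connect to a server. It is an illustration only, not a reconstruction of AWS's systems: the `resolve` helper and the hostnames are assumptions chosen for the example, and `no-such-host.invalid` uses the reserved `.invalid` domain, which is never supposed to resolve.

```python
import socket


def resolve(hostname, port=443):
    """Return the sorted, unique IP addresses for hostname, or [] if the lookup fails.

    Hypothetical helper for illustration; real applications usually let
    their HTTP or database client library perform this step.
    """
    try:
        # Ask the system resolver (and, through it, DNS) for the host's addresses.
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The name could not be resolved: there is no address to connect to.
        return []
    # Each entry's fifth field is the socket address; its first item is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for name in ("localhost", "no-such-host.invalid"):
        addresses = resolve(name)
        if addresses:
            print(f"{name} -> {', '.join(addresses)}")
        else:
            print(f"{name}: lookup failed, no address to connect to")
```

When the lookup returns nothing, the app never gets as far as contacting the server, which is why a DNS fault can take down services whose own machines are otherwise running.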
The outage, which began at approximately 3:11 a.m. ET, originated in one of AWS’s primary data centers in Virginia, its oldest and largest site. As DynamoDB went offline, other AWS services also began to fail, with a total of 113 services ultimately impacted. Although the DNS problem was resolved fairly quickly, many apps and services remained offline for most of the day. This is likely because, even after Amazon reported that all AWS services had returned to normal operation, a backlog of messages still had to be processed over the following hours.
The extensive list of affected services included popular communication apps such as WhatsApp, Signal, Zoom, and Slack; gaming platforms like Roblox, Fortnite, and Xbox; and companies such as Starbucks and Etsy. Financial apps, including Venmo, also experienced issues, while some users reported that their Ring doorbells and Alexa speakers had stopped functioning. Several media organizations, such as the Associated Press, the New York Times, and the Wall Street Journal, were also impacted by the outage.
AWS, the world’s largest cloud service provider, allows companies to rent computing power and storage, supplying the technology that runs websites, apps, and many online services behind the scenes. While cloud outages are not uncommon, they have become more noticeable as an increasing number of companies rely on these services daily.
AWS has stated that it will publish a detailed post-event summary explaining the incident, but has not yet done so. Many questions remain about the incident, such as whether a “brain drain” of experienced employees replaced by H-1B visa workers contributed to the massive failure, or whether the company failed to design redundant systems to protect against failures.
Breitbart News will continue to report on this story.
Lucas Nolan is a reporter for Breitbart News covering issues of free speech and online censorship.