Comment from Semperis: Facebook outage
October 2021 by Guido Grillenmeier, Chief Technologist, Semperis
Below is a comment on this week's Facebook outage from Guido Grillenmeier, Chief Technologist, Semperis:
The fact that Facebook was unreachable, and along with it the Facebook Messenger app as well as WhatsApp and Instagram, was noticeable globally, for users of all ages.
But the impact went beyond these well-known apps. Because Facebook also acts as an identity provider for other apps, its outage had far-reaching consequences beyond its own services: any other application, typically a web app, that uses the “Sign in with Facebook” option was affected, and with it the businesses that rely on those identities.
It soon became clear that the culprit behind the whole outage was, quite simply, the process of resolving the readable part of website URLs, i.e. their domain names and related server names, to their respective IP addresses. The latter are required for machines to communicate with each other: while we humans use names, computers and apps use IP addresses to connect. But if the system you rely on to resolve those names to IPs, i.e. the Domain Name System (DNS), is malfunctioning, you are in trouble.
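To make the name-to-IP step concrete, here is a minimal Python sketch of what a client does before it can connect. The helper name resolve_name is hypothetical and chosen for illustration only; it relies on the standard library's socket.getaddrinfo and simply returns an empty list when resolution fails, which is roughly what clients experienced for facebook.com during the outage.

```python
import socket


def resolve_name(hostname, port=443):
    """Return the sorted, de-duplicated IP addresses for hostname.

    Returns an empty list when the name cannot be resolved, for example
    because no DNS server can answer for the domain.
    """
    try:
        # Ask the operating system's resolver (which normally uses DNS).
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Resolution failed: without an IP address, no connection is possible.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    addresses = resolve_name("facebook.com")
    if addresses:
        print("facebook.com resolves to:", ", ".join(addresses))
    else:
        print("facebook.com could not be resolved")
```

The addresses printed depend on where and when the script runs, so no fixed output is shown here.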
When the Facebook IT team was making some updates, a configuration mistake led their systems to send out routing updates that accidentally told the rest of the internet that the DNS servers for facebook.com could no longer be reached. As a result, no DNS server anywhere could resolve any name related to facebook.com to an IP address, so systems wanting to use the various Facebook services had nothing to connect to.
Your business may not have been directly affected by this outage. However, many companies have started to use WhatsApp and Instagram to collaborate with their customers, for example to offer technical support. Other companies depend heavily on Facebook because their users authenticate with their Facebook accounts, i.e. they use Facebook as an identity provider. Once facebook.com was no longer reachable, no new connections could be established.
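For an application that depends on an external identity provider, one way to soften such an outage is to check whether the provider can be resolved before offering it as a sign-in option. The following Python sketch illustrates the idea under stated assumptions: the function names, the "local account" fallback, and the injectable resolver parameter are all hypothetical, and a successful DNS lookup only shows that the name resolves, not that the sign-in service itself is healthy.

```python
import socket


def idp_is_resolvable(hostname, resolver=socket.getaddrinfo):
    """Return True if the identity provider's hostname resolves to an address."""
    try:
        return len(resolver(hostname, 443)) > 0
    except OSError:
        # socket.gaierror is a subclass of OSError; treat any lookup
        # failure as "provider currently unreachable".
        return False


def sign_in_options(idp_hostname, resolver=socket.getaddrinfo):
    """List the sign-in methods to offer, most preferred first."""
    # Hypothetical fallback that does not depend on the external provider.
    options = ["local account"]
    if idp_is_resolvable(idp_hostname, resolver):
        options.insert(0, "Sign in with Facebook")
    return options


if __name__ == "__main__":
    print(sign_in_options("facebook.com"))
```

Passing the resolver as a parameter is a design choice made here so that the outage case can be simulated in a test without touching real DNS.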
It’s a horror scenario for any identity provider. Think of Azure AD no longer being available on the web: in that case it could only be fixed by Microsoft, not by you. On a smaller scale, your local on-premises Active Directory depends just as heavily on a properly configured DNS.
In any case, whether it’s from a DNS misconfiguration or for some other reason such as a direct cyberattack, if your identity system goes down, your business will be affected dramatically. So ideally you have validated your backups and have your incident response plan ready to go.