Microsoft has restored its Azure cloud services following a major global outage that disrupted operations across multiple regions and affected platforms including Microsoft 365, Outlook, Minecraft, and Xbox Live.
The outage, which ran from 3:45pm UTC (9:15pm IST) on October 29 to 12:05am UTC (5:35am IST) on October 30, also hit companies such as Alaska Airlines and Starbucks, along with numerous other businesses that depend on Azure Front Door (AFD), Microsoft’s global content delivery network.
What Caused the Outage
According to Microsoft, the disruption was triggered by an inadvertent tenant configuration change within Azure Front Door. This misconfiguration caused a large number of AFD nodes to fail to load correctly, creating latency spikes, timeouts, and connection errors across various services.
The issue stemmed from a faulty deployment process that bypassed existing safety mechanisms designed to block invalid configurations. Microsoft admitted that a software defect in its validation system allowed the problematic deployment to proceed, causing widespread service interruptions.
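For readers curious what a validation gate of this kind looks like, here is a minimal sketch in Python. It is not Microsoft's code: `TenantConfig`, `validate`, `deploy` and the specific checks are hypothetical names chosen for illustration. The idea is simply that a deployment step refuses to apply any configuration that fails its checks, so a defect in the checks, or a path around them, lets a bad configuration reach production.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class InvalidConfigError(Exception):
    """Raised when a configuration fails validation and must not be deployed."""


@dataclass
class TenantConfig:
    # Hypothetical shape of a tenant configuration: URL path -> backend host.
    tenant_id: str
    routes: Dict[str, str] = field(default_factory=dict)


def validate(config: TenantConfig) -> List[str]:
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    if not config.tenant_id:
        problems.append("tenant_id is empty")
    if not config.routes:
        problems.append("no routes defined")
    for path, backend in config.routes.items():
        if not path.startswith("/"):
            problems.append(f"route {path!r} does not start with '/'")
        if not backend:
            problems.append(f"route {path!r} has no backend")
    return problems


def deploy(config: TenantConfig, apply: Callable[[TenantConfig], None]) -> None:
    """Apply the config only if validation passes; otherwise block it."""
    problems = validate(config)
    if problems:
        raise InvalidConfigError("; ".join(problems))
    apply(config)
```

In this sketch every deployment has to pass through `deploy`. Any code path that calls `apply` directly, or a bug that makes `validate` return an empty list for a bad configuration, defeats the safeguard.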
Which Services Were Affected
Among the services impacted were Azure App Service, Azure Communication Services, Azure Virtual Desktop, Microsoft Defender External Attack Surface Management, Microsoft Purview, and Microsoft Sentinel.
Monitoring platform Downdetector recorded over 16,000 user reports at the peak of the outage around 9:47pm IST.
How Microsoft Fixed the Problem
In response, Microsoft blocked further configuration changes to halt the spread of the faulty state and re-deployed a previously stable configuration across its global network. The recovery process involved gradual traffic rebalancing to avoid system overload as services were restored.
“This deliberate, phased recovery was necessary to stabilise the system while restoring scale and ensuring no recurrence of the issue,” the company said in a statement.
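A phased recovery of the kind described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Microsoft's procedure: the function `phased_recovery`, its `restore` and `is_healthy` callbacks, and the batch size are all hypothetical. Nodes are returned to the last known good configuration a few at a time, and the rollout pauses as soon as a batch fails its health check, so that a problem is caught before it spreads.

```python
from typing import Callable, List


def phased_recovery(
    nodes: List[str],
    restore: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    batch_size: int = 2,
) -> List[str]:
    """Restore nodes in small batches and return those back in service.

    The rollout stops after the first batch that contains an unhealthy node.
    """
    in_service: List[str] = []
    for start in range(0, len(nodes), batch_size):
        batch = nodes[start:start + batch_size]
        for node in batch:
            restore(node)  # e.g. re-apply the last known good configuration
        healthy = [node for node in batch if is_healthy(node)]
        in_service.extend(healthy)
        if len(healthy) < len(batch):
            break  # pause the rollout instead of pushing on to more nodes
    return in_service
```

Restoring in batches trades speed for safety: recovery takes longer, but traffic returns to nodes only after they have been checked, which reduces the risk of overloading the nodes that come back first.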
Although services are now fully restored, configuration changes to AFD remain temporarily blocked, with affected customers to be notified once the restriction is lifted.
What Happens Next
Microsoft confirmed it has introduced new safeguards — including enhanced validation and rollback controls — to prevent similar incidents in the future. An internal review is also underway, and the company has committed to sharing a Post-Incident Review (PIR) with affected customers within 14 days.
The outage serves as a reminder of the critical role cloud infrastructure plays in global digital operations — and how even minor configuration changes can cause large-scale disruptions across industries.