I’ve been giving Azure a fair bit of grief lately for some embarrassing information security lapses, and I think it’s only fair for me to explain in a format beyond “some tweets” exactly why that is. The write-ups I’ve seen have all been deeply technical and more or less bury the lede, so let me begin with a quick summary of the three issues that have pivoted my impression of Azure from “serious contender, albeit one that targets a different market than the ones I talk to” to “this is a security clownshow that should be actively avoided.”
Issue 1: Azurescape
In September, Palo Alto Networks identified the Azurescape vulnerability. This is important because it’s the first documented case of a vulnerability in a hyperscale cloud provider that “could enable one user of a public cloud service to break out of their environment and execute code on environments belonging to other users in the same public cloud service.”
Let me be very clear here: This is the terrifying outcome when it comes to cloud security. I, as a customer, getting read, write, and execute permissions to your cloud environment is the stuff of absolute nightmares. It validates every crapass “the cloud isn’t secure” take we’ve heard for the last 15 years.
Further, Azure became aware of the issue only when Palo Alto Networks’ security researchers reported it. This means that not only was an escape from the tenant environment into the control plane possible, but that Azure’s security team didn’t detect it until the attackers explicitly told them it had happened.
Issue 2: ChaosDB
In August, the folks at Wiz discovered the ChaosDB vulnerability in Azure’s flagship CosmosDB database. Wiz security researchers were able to gain “complete unrestricted access to the accounts and databases of several thousand Microsoft Azure customers.” If you take a look at who those customers are you’ll see a lot of big company names, none of whom are renowned for being comfortable with leaking customer data.
Once again, the security researchers had access to the control plane and were able to read and write other customers’ data. And once again, Azure was made aware of this only when the researchers reported their findings, not by the giant pile of alarms that should have been going off as soon as a customer managed to escape the tenant environment.
Issue 3: OMIGOD
The third incident is almost prosaic by comparison. Azure embeds a management agent called Open Management Infrastructure, or OMI, in many of its services (and I am telling you OMI is pronounced with three syllables, not two). In September, the Wiz researchers discovered a series of vulnerabilities in this agent that amounted to the ability to run arbitrary code remotely as root if you were able to talk to the port the agent listened on. This was fittingly christened “OMIGOD” after what everyone said when they realized the implications.
I’m not as annoyed by this vulnerability as I am by the other two because customer mitigation is and was possible. “Ensure that random networks can’t communicate with the management agent’s port on your VMs” is one of those good security practices that customers should really have implemented as a part of a defense-in-depth strategy. By contrast, the issue with the other two vulnerabilities is that customers were exposed despite doing everything right.
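For the curious, here is a minimal sketch of how you might check that mitigation from the outside. It simply tries to open a TCP connection to a handful of ports and reports which ones answer. The port numbers are an assumption on my part (5985, 5986, and 1270 are the ones commonly cited for OMI’s remote listener, so verify against your own configuration), and the function names are made up for illustration. This is not Microsoft’s tooling, and it doesn’t detect the vulnerability itself, only whether the port is reachable from wherever you run it.

```python
import socket

# Assumption: ports commonly cited for OMI's remote listener.
# Confirm the actual ports for your own services before relying on this list.
OMI_PORTS = (5985, 5986, 1270)


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reachable_omi_ports(host, ports=OMI_PORTS, timeout=2.0):
    """Return the subset of ports on host that accept a TCP connection."""
    return [port for port in ports if port_is_reachable(host, port, timeout)]
```

Run from a network that should not be able to reach your VMs, something like `reachable_omi_ports("203.0.113.10")` (a documentation-only placeholder address) ought to come back as an empty list. Anything else means the port is exposed to that network and your firewall or network security group rules deserve a second look.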
Why do I care?
This is “Last Week in AWS.” I rarely talk about Azure here because, as I’ve freely and frequently admitted, it isn’t really my area. I don’t see it being used to run large web-scale properties or workloads that aren’t already Microsoft-centric. When I see it in customer environments, it’s there to run something like MS SQL Server, thanks to the games Microsoft likes to play with licensing when you run their stuff on other providers.
But when I talked about this at re:Invent last year with folks from large enterprises or name-brand analyst firms, the reaction was universally the same: horror at the breach, but also an unwillingness to scold Microsoft too loudly in public because (after all) who can afford to annoy a multitrillion-dollar company? You’ve still gotta do business with them.
I’m willing to take that risk, because my failure mode here is what, that Microsoft gets upset with me and jacks up the cost of my Office365 renewal? I’ll risk it.
Microsoft’s Response
I will get scolded on Twitter for this if I’m not very clear here: Microsoft did in fact issue responses to Azurescape, OMIGOD, and ChaosDB, so it’s not accurate to say that they haven’t said anything. My issue with these statements is the questions they don’t answer.
- Why was it possible for researchers to gain control-plane access while you remained unaware of that fact?
- When you say “our investigation surfaced no unauthorized access to customer data,” is that because your logging and telemetry only showed the researchers’ activity? Or is the story closer to “we don’t have enough visibility to say one way or the other”?
How are folks who care about security (y’know, customers?) supposed to in good faith recommend Azure for workloads when these questions remain unanswered? I haven’t seen their execs out on the speaking circuit explaining the answers to these questions. I can’t fathom a scenario in which Google or AWS suffered from vulnerabilities like this and didn’t make loud, sweeping reforms that they talk about constantly just to rebuild the trust that they would have burned through.
No cloud provider “wins” here. The narrative in a lot of enterprise boardrooms these days with respect to this incident isn’t that Azure is sloppy, it’s that the cloud is a giant ball of risk. I admit I used to think that AWS and GCP’s constant trotting out of their security models and how they are structured was boring! “Every cloud provider does things this way! Why are you wasting our time?”
Except that apparently not every cloud provider does things that way, and “defense in depth” is instead something that’s more aspirational than accurate for one of the three big hyperscalers. Every cloud provider has vulnerabilities, but those vulnerabilities don’t usually result in people scampering around the control plane with the squirrels. This is a whole new category of security breach for a cloud provider.
Almost a year ago, I wrote an article called The Future of Cloud is Microsoft’s to Lose. In a lot of ways, I think their laconic approach to security may very well have done exactly that.