Back at the start of These Unprecedented Times, my business partner and I hosted two Q&A sessions. Our attendees asked a bunch of questions about AWS accounts, which we’ve compiled here, lightly edited to make ourselves sound way more intelligent.
Do you have any concerns about the organization master account also being the consolidated billing account?
Excellent question. My answer to this has changed recently.
AWS hasn’t announced this yet, but as of this writing the APIs have changed to show that you can now delegate certain tasks from the master account (AWS’s term, not mine, don’t email me) to a member account. That’s something I’ve been ranting about on Twitter of late.
The master account has a lot of power. There are things that you need to do within the master account that can’t be done anywhere else. That’s where your Cost & Usage Reports wind up, for example.
To analyze those, you’re probably going to use Athena (or Redshift / Snowflake if the VC Money Fairy has whapped you aside the head with their Money Wand), so you’re going to need the Cost & Usage Reports delivered as Parquet files, with a Lambda function or a Glue job firing off periodically. There are other tooling options here, but by default every AWS blog post or tutorial on the subject features some combination of these services being provisioned inside of the customer’s master account.
At the Duckbill Group we do all of this for our clients–but we make it a point to copy the data out of the customer’s master account first. This is incredibly important!
Here’s the point: you don’t want any infrastructure to live within the master account if you can possibly avoid it. That account is super-powered, and it’s an AWS best practice to have no infrastructure in that account–or at least, as little of it as you can get away with. Apparently the AWS billing team has yet to receive that memo in full.
Personally, I would love to see additional options around delegating various billing and payment functionalities to a member account. Finance people at mature shops, for example, need to have deep access to a lot of that data. And you probably don’t want them in the organization’s master account for reasons that can become painfully apparent mere moments after you could have benefited greatly from the information. The master account is also what gets explicitly tied to various contract agreements you might have with AWS, the account that dictates what support looks like for the organization, the account that determines who is inside vs. outside the organization, and more. If someone has access to all of it, you’ve got problems. Most mature shops can’t let someone who has access to production also be able to access the audit logs for who has accessed production, because embezzlement.
Bottom line? There are a lot of things that need to be broken apart here, but the quiet API changes offer me tentative hope they’re listening.
Are there any unique aspects to the root user account, what are those aspects, and when would I ever need to use the root account after I set up IAM?
The root user of an AWS account is generally used on a day-to-day basis only by folks who have not been exposed to IAM, or who have looked at IAM and run screaming from the complexity.
It’s an overpowered account in many respects. You’re going to need to use it to change your support plan. You’re going to need it for certain tax credits. And there are a few other assorted tasks that only the root account can accomplish, but to AWS’s credit, that list is far shorter than it once was.
In practice, the real problem you see around root accounts in responsible organizations is that people do the right thing. They set a tremendously long password. They then set up a multifactor device that’s required to log into the account, and then they secure it in a fireproof safe.
Then, inevitably, two years later they need to access the account and can’t find the safe. If this is you, open an AWS support ticket and be prepared to wait a few days.
What are some of the tagging policies that we recommend to our clients?
Having a tagging policy at all instead of a haphazard mess would be a good first step!
We recommend a few: Team; Department, if there’s a larger organization at play; Accounting (COGS, R&D, G&A being sample values) to make it easy for the finance team; Service to define what larger part of the application it contributes to; and Data Classification, for regulatory compliance (e.g., GDPR, PCI, and HIPAA).
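A policy like this is easier to enforce once it is written down as data. Here is a minimal sketch of a checker, assuming tag keys spelled the way they appear in the list above. The key names, the `find_tag_violations` helper, and the choice to treat Department as optional are illustrative assumptions, not a standard.

```python
from __future__ import annotations

# Hypothetical tag policy based on the list above. Key spellings and the
# allowed Accounting values are assumptions for illustration only.
# None means "any non-empty value is accepted".
REQUIRED_TAGS: dict[str, set[str] | None] = {
    "Team": None,
    "Accounting": {"COGS", "R&D", "G&A"},
    "Service": None,
    "DataClassification": None,
}


def find_tag_violations(tags: dict[str, str]) -> list[str]:
    """Return human-readable problems with one resource's tags."""
    violations = []
    for key, allowed in REQUIRED_TAGS.items():
        value = tags.get(key, "").strip()
        if not value:
            violations.append(f"missing tag: {key}")
        elif allowed is not None and value not in allowed:
            violations.append(f"invalid value for {key}: {value}")
    return violations


if __name__ == "__main__":
    print(find_tag_violations({"Team": "platform", "Accounting": "Marketing"}))
```

Department is left out of the required set here because it only applies when there is a larger organization at play.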
As you’re building out your tagging strategy, remember you want to talk to your finance team. They’re going to be asking questions six months from now and you’re going to want to have the data to answer those questions with a single query, instead of an eight-week data science project. Cost allocation tags are not retroactive; they only start allocating from the hour in which you apply them. Ergo, you’re going to have to confront my greatest personal fear: planning ahead.
Also, give up now on the idea of trying to get everything tagged. If you can set a baseline of 80% or better, you’re outperforming almost everyone. And, in most cases, you’re not an organization where you’re required to spend thousands of dollars to allocate that last 20 cents.
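To know whether you’re clearing that baseline, you have to measure it. The sketch below assumes coverage is measured by share of spend rather than by resource count, which is one reasonable reading of the 80% figure but not the only one. The `tagged_spend_share` name and the `(cost, tags)` input shape are hypothetical.

```python
def tagged_spend_share(line_items, tag_key):
    """Fraction of total spend carrying a non-empty value for tag_key.

    line_items is a list of (cost, tags) pairs, where tags is a dict.
    This input shape is an assumption made for the example.
    """
    total = sum(cost for cost, _ in line_items)
    if total == 0:
        return 0.0
    tagged = sum(
        cost for cost, tags in line_items if tags.get(tag_key, "").strip()
    )
    return tagged / total


if __name__ == "__main__":
    items = [
        (80.0, {"Team": "payments"}),
        (15.0, {"Team": ""}),  # an empty tag value counts as untagged
        (5.0, {}),
    ]
    print(f"{tagged_spend_share(items, 'Team'):.0%} of spend is tagged")
```

Weighting by spend keeps attention on the untagged resources that actually cost money, instead of the long tail that accounts for the last 20 cents.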
Lastly, never ever expect people to tag things by hand. They won’t. If you want decent tag coverage, tagging absolutely must be automated.
And only once all of that’s done do you realize that tagging has serious security implications.
What’s your recommendation on automatically shutting non-production environments down outside of normal working hours?
First off, turning off developer resources out of hours is great for cost savings. But the first time that your 6PM shutdown gets in the way of someone fixing something, you’re going to be in trouble.
I like the model in theory. If I designed something like this, I would do two things. First, I’d validate that developer or non-prod environments are meaningful cost drivers. One time, I got disgustingly far down this path before discovering that it wasn’t a meaningful spend component. “Oh, your development environment is 3% of your bill, and you haven’t bought a Reserved Instance since the Eisenhower administration.”
Assuming the development expense was worth optimizing, I would set up a scheduled instance turn-down approach and hook it up to a Slack bot that messages developers with something like, “This thing is turning off for the night in half an hour. Click here to keep it open.” (This is where cost allocation tagging becomes useful: the tags tell you who owns the thing and therefore who to message.)
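The decision logic behind a bot like that is small enough to sketch. This is a rough outline under stated assumptions, not a finished tool: the `Environment` record, `next_action`, and `warning_message` are hypothetical names, and the parts that actually stop instances or post to Slack are left out on purpose.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Half an hour of warning, per the message described above.
WARNING_WINDOW = timedelta(minutes=30)


@dataclass
class Environment:
    name: str
    owner: str  # assumed to come from an ownership tag such as Team
    shutdown_at: datetime
    keep_alive_until: Optional[datetime] = None  # set when someone clicks


def next_action(env: Environment, now: datetime) -> str:
    """Return 'none', 'warn', or 'stop' for this environment right now."""
    if env.keep_alive_until is not None and now < env.keep_alive_until:
        return "none"  # a developer asked to keep it open
    if now >= env.shutdown_at:
        return "stop"
    if env.shutdown_at - now <= WARNING_WINDOW:
        return "warn"
    return "none"


def warning_message(env: Environment) -> str:
    """Text for the bot to send; delivery is out of scope for this sketch."""
    return (
        f"{env.owner}: {env.name} is turning off for the night at "
        f"{env.shutdown_at:%H:%M}. Click here to keep it open."
    )


if __name__ == "__main__":
    env = Environment("dev-api", "team-payments", datetime(2024, 1, 15, 18, 0))
    now = datetime(2024, 1, 15, 17, 45)
    if next_action(env, now) == "warn":
        print(warning_message(env))
```

A real version would also need to remember which warnings it has already sent so it doesn’t nag on every poll, and the keep-alive path deserves as much testing as the shutdown path.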
You’re also going to want to make sure that this system is carefully tested and calibrated. It takes more work than you think to get this dialed in. Because remember: the first time the cost savings effort breaks something, you’re not allowed to save money anymore.
If you have more questions about AWS accounts, I’d like to think I have answers. Drop me a line.