A few hours ago, I asked folks for questions about their AWS bills. I picked a dozen to dive into in a bit of depth.

The answers are surprisingly nuanced and, as it turns out, not really a fit for Twitter, since nobody wants to read a 400-tweet thread. So please forgive my rambling in blog form.

Question 1: CloudWatch Metrics

Our first question:

Figuring out which CloudWatch metrics cost money is an exercise in frustration—an exercise, incidentally, that doesn’t add any direct business value to what you’re otherwise working on.

A good baseline is to figure that any time you enable detailed metrics, it’s going to cost you. If you enable the CloudWatch Agent so you can expose memory metrics to CloudWatch, that’s going to cost you, too. In other words, if you don’t have the option to disable a metric, expect it to be free; everything else will show up in your bill.

What makes CloudWatch extra fun is that you also get charged to query those metrics. Take a look at whatever monitoring system you have: Datadog, SignalFx (now captured by Splunk), Prometheus, what have you. It’s going to cost money for every metric it pulls from CloudWatch. So stop querying for metrics you don’t care about.

A classic example here is EBS volumes. You care about some things around EBS—but they’re usually not hypervisor level, which is all that CloudWatch gives you by default.

Instead, you care about host-level metrics, such as “is my volume filling up?” Install an agent and query that, then disable the EBS volume monitoring via CloudWatch.
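To get a feel for how those query charges add up, here is a rough back-of-the-envelope sketch. The price of $0.01 per 1,000 metrics requested is an assumption for illustration (check the current CloudWatch pricing page for your region), and the function name and polling numbers are hypothetical.

```python
def monthly_metric_query_cost(
    metric_count: int,
    poll_interval_seconds: int,
    price_per_thousand_metrics: float = 0.01,  # assumed price, not authoritative
    days_in_month: int = 30,
) -> float:
    """Estimate what a monitoring tool pays to pull metrics out of CloudWatch."""
    # How many times the tool polls in a month.
    polls_per_month = days_in_month * 24 * 60 * 60 // poll_interval_seconds
    # Every poll requests every metric, whether anyone looks at it or not.
    metrics_requested = metric_count * polls_per_month
    return metrics_requested / 1000 * price_per_thousand_metrics


if __name__ == "__main__":
    # Hypothetical setup: 500 metrics scraped once a minute.
    print(f"${monthly_metric_query_cost(500, 60):,.2f} per month")
```

Under these assumptions, 500 metrics polled every minute works out to roughly $216 a month. Halving the metric count or doubling the interval halves the charge, which is why pruning metrics you never look at pays off.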

Question 2: Savings Plans

Our next question is around Savings Plans:

I’ve written a post about Savings Plans that gives the basics for this, but let’s go into the application method. 

Each hour, AWS looks at the qualifying spend items (e.g., EC2 instances, Fargate containers, or Lambda functions) you’ve run in your account, and sorts the usage for that hour based upon greatest discount. It then applies the savings plan to the best option and moves down to the next item for as long as there’s capacity.

This means two things:

  1. You’ll always get the savings plan applied to the place where it will save you the most money.
  2. It’s freaking impossible to predict exactly how much you’re going to save. The best option you’ve got is to look at your historical usage and run some scenario modeling. “If you’d committed to spending $4 an hour in savings plans, you’d have saved x per hour over the given timespan.” 

This requires a lot of custom modeling work. Fortunately for you, we’ve built something that does exactly this; if understanding what a given SP purchase is going to save you before you buy it is important to your finance folks, please get in touch!

Given the access this thing requires to highly sensitive data, there’s unfortunately no responsible way to turn this into a SaaS product.
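To make the hourly application order described above concrete, here is a toy sketch of a single hour. It is a simplified model and not AWS’s actual billing logic: the class and function names are hypothetical, the discount rates are made up, and it assumes the commitment is spent at the discounted rate on the highest-discount usage first.

```python
from dataclasses import dataclass


@dataclass
class HourlyUsage:
    name: str
    on_demand_cost: float  # what this usage would cost on demand for the hour
    discount: float        # fractional Savings Plan discount (hypothetical values)


def hourly_bill(usages: list[HourlyUsage], commitment: float) -> float:
    """Apply an hourly commitment to the highest-discount usage first."""
    remaining = commitment
    uncovered_on_demand = 0.0
    for usage in sorted(usages, key=lambda u: u.discount, reverse=True):
        # Cost of this usage at the Savings Plan rate.
        discounted_cost = usage.on_demand_cost * (1 - usage.discount)
        covered = min(discounted_cost, remaining)
        remaining -= covered
        covered_fraction = covered / discounted_cost if discounted_cost else 1.0
        # Whatever the commitment could not cover is billed on demand.
        uncovered_on_demand += usage.on_demand_cost * (1 - covered_fraction)
    # The commitment is owed whether or not it is fully used.
    return commitment + uncovered_on_demand


def hourly_savings(usages: list[HourlyUsage], commitment: float) -> float:
    """Savings versus paying on demand for everything (negative if over-committed)."""
    on_demand_total = sum(u.on_demand_cost for u in usages)
    return on_demand_total - hourly_bill(usages, commitment)
```

Scenario modeling then amounts to running `hourly_savings` over every hour of historical usage for each candidate commitment and summing the results. Real usage shifts from hour to hour, which is why the answer is so hard to predict up front.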

Question 3: Competitive Analysis of Azure

Our next question is about a competitive analysis of Azure:

https://twitter.com/blatanterror/status/1240030051313135619?s=20

There’s no cloud provider I’ve ever seen that offers both metered billing and an easily understandable bill. DigitalOcean comes close, but you’re paying a fixed fee for a given service up to some limit; it’s not exactly the same thing.

What makes Azure interesting is that spend at almost any company with a sizable Azure footprint is governed by a custom Enterprise Agreement with Microsoft. It’s not just about the knobs and dials of tuning their workloads for efficiency, but also about a whole world of license compliance.

What’s considered a permissible use of a given Microsoft license that you’ve purchased inside of various cloud providers? I’d need to be either an attorney or a Microsoft licensing expert in order to tell you accurately. Ergo, I stay out of those waters entirely.

Question 4: The Many Bills of AWS

Next we get into AWS’s own worst enemy: themselves.

There’s an old saying: “A person with a watch knows what time it is; a person with two watches is never sure.” AWS, of course, has five watches.

This is a deceptively complicated question to answer, given that there are multiple ways to view the bill:

  • The invoice. It’s labeled “invoice” and it’s what you pay. It’s the One True Bill. If you don’t pay this, bad things will happen to you.
  • The bill, which should match the invoice but may not: the invoice gets finalized a few days after the billing data does. The bill is a way to see billing data before the end of the month, and it contains a lot of line item usage data. Duckbill uses this to spot-check spend. One of my favorite party tricks is a “dramatic reading of your AWS bill”; this is the view I use to do it.
  • Cost Explorer: This is a tool that provides a way to view billing data that allows you to query / slice / dice it to pieces across a lot of different dimensions. This is the only place you’re likely to see raw usage and billing data.
  • Cost and Usage Reports, or CURs: The underlying data that Cost Explorer lives on top of, from what we can tell. It’s never been entirely clear to the outside world what powers Cost Explorer, due to what we can only assume is a deep and abiding sense of shame on the part of the team that built it. (I mostly kid; it’s worlds better now than it was at launch!)
  • The Detailed Billing Reports: These are now deprecated, but are (almost) functionally the same thing as the CURs. 

The amortized view within Cost Explorer is a way of normalizing the bill over the course of a reservation–otherwise it’ll look like you spent an awful lot one month, and next to nothing in subsequent months for the duration of the reservation.

Blended costs are viewable at the master payer level, and that shows cost across all linked accounts. We’ve yet to see a customer care about blended rate; they bias instead for amortized rate in every single environment we’ve seen to date. It was useful in the days of DBRs, but with CURs it’s not needed anymore. If you’re tracking blended cost, please get in touch–I have oh so many questions for you!

Unblended cost is what you should normally be using to stay on top of your AWS usage. Amortized cost is great for seeing total cost with RIs/SPs over the RI’s or SP’s lifespan, but it can hide usage data, so be forewarned.
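The difference between the unblended and amortized views is easiest to see with a toy all-upfront purchase. This is a minimal sketch with made-up numbers and a hypothetical function name; it splits the fee evenly by month, whereas the real billing data amortizes at a finer granularity.

```python
def monthly_views(upfront_fee: float, term_months: int) -> tuple[list[float], list[float]]:
    """Return (unblended, amortized) monthly cost series for an all-upfront purchase."""
    # Unblended: the whole fee lands in the month you paid it.
    unblended = [upfront_fee] + [0.0] * (term_months - 1)
    # Amortized: the same fee spread evenly across the term.
    amortized = [upfront_fee / term_months] * term_months
    return unblended, amortized


if __name__ == "__main__":
    # Hypothetical $1,200 one-year reservation paid entirely upfront.
    unblended, amortized = monthly_views(1200.0, 12)
    print("unblended:", unblended[:3], "...")
    print("amortized:", amortized[:3], "...")
```

Both series sum to the same total. The unblended view shows when the cash went out, while the amortized view shows what the reservation costs you in each month it is in effect.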

The trouble is that these views disagree with each other a fair bit of the time.

It’s also worth pointing out that I tend to disregard bill data that’s less than two or three days old; there’s an eventual consistency issue within the billing system, which means that you won’t have a fully accurate picture for a while. Granted, the actual lag is less than that. But there have been enough historical issues there that I tend to give it a few days’ grace period and just work on settled data for my own peace of mind.

Question 5: Delinquency

Here’s a fun one: delinquency!

The honest answer here is “I have no clue.” I’ve never tried not paying an AWS bill, given my pesky habit of not making vendors chase me down. Further, my customers are running large workloads on AWS, and business wouldn’t be good if those workloads were suddenly turned off. 

I have seen “overdue” payments go many months. But those are almost invariably due to strange billing artifacts that haven’t caught up within the automated systems.

For better or worse, I’ve never yet seen one of these rise to the level of causing an automated AWS account restriction; their account teams are on point!

In other words: I hope I never find out.

Question 6: Free Tier

Next, a free tier question:

If this is a significant burden for you—who am I to say it’s not—spinning up the environment within a new AWS account might make sense. 

Now, this is against their terms of service for the free tier, but so is how they handle multiple accounts within an AWS Organization! 

In my experience, every time you spin up a new account within one of those, the free tier is applied to it. I didn’t ask for it, I didn’t want it, I don’t count on it, but there it is.

Downside: You’ll get another barrage of email from AWS marketing telling you about the new account you’ve spun up for every damned account.

Question 7: Cost Allocation Tags

Let’s talk Cost Allocation tags!

Cost Allocation tags show up in your Cost and Usage Reports, as well as within Cost Explorer. They let you slice and dice your bill along those tagging axes. They’re not retroactive, so you get to figure out today what questions you’re going to want to answer in a few months.

At a glance, my favorites are always “Environment” (think development versus staging versus production, though these should ideally be separate AWS accounts entirely), “Application” (for instance, “Data Warehouse,” or “Web site”), and the ever popular “Team” or “Cost Center,” which is likely to be highly company-specific.

Others that the Duckbill Group recommends include “Team,” “Accounting Lines” or my personal favorite, “DataClassification” to identify whether or not a service contains any sensitive data; if you’re a regulated entity, this one’s for you.

One note of caution: don’t attempt to either set tags manually (it’s a losing battle) or strive for full tagging coverage; you’re looking for answers here that are directionally correct. Chasing the last 5% of coverage will drive you up a wall, not least because there are still some services that aren’t taggable.
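As a rough illustration of “directionally correct,” here is a minimal sketch that groups spend by a tag and reports how much of the spend is tagged at all. The record shape (a `cost` number plus a `tags` dict) is a hypothetical simplification, not the actual Cost and Usage Report schema, and the function names are made up for this example.

```python
from collections import defaultdict


def cost_by_tag(line_items: list[dict], tag_key: str) -> dict[str, float]:
    """Sum cost per tag value, lumping everything untagged into one bucket."""
    totals: dict[str, float] = defaultdict(float)
    for item in line_items:
        value = item.get("tags", {}).get(tag_key, "(untagged)")
        totals[value] += item["cost"]
    return dict(totals)


def tag_coverage(line_items: list[dict], tag_key: str) -> float:
    """Fraction of total spend that carries the given tag."""
    total = sum(item["cost"] for item in line_items)
    tagged = sum(item["cost"] for item in line_items if tag_key in item.get("tags", {}))
    return tagged / total if total else 0.0


if __name__ == "__main__":
    # Made-up line items for illustration.
    items = [
        {"cost": 60.0, "tags": {"Environment": "production"}},
        {"cost": 30.0, "tags": {"Environment": "development"}},
        {"cost": 10.0, "tags": {}},
    ]
    print(cost_by_tag(items, "Environment"))
    print(f"coverage: {tag_coverage(items, 'Environment'):.0%}")
```

If the untagged bucket is a small slice of the total, the breakdown is already good enough to answer most questions, and chasing the remainder is rarely worth it.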

Question 8: S3 API Usage

To that end, our next question:

S3 API calls are a nightmare to track. 

The answer here lies within tags (which you’ve almost certainly forgotten to track), CloudWatch, or (and this gets spendy at scale) CloudTrail once you enable data events. With CloudWatch, you can see request metrics on a per-bucket basis and, if you set up metrics filters in the bucket configuration, which prefixes are seeing the request traffic. This requires a bit of insight into how your buckets are structured. But you can pretty easily get to a point of seeing where the most expensive API calls are hitting you once you get the hang of how S3 metrics work.

If you opt instead for CloudTrail data events, understand that every ten thousand API events costs you a penny. If you’re trying to grasp what’s going on at scale, this isn’t going to be a small number if you forget to turn it off after your analysis is complete.
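To put that penny in perspective, here is a small worked calculation using the rate quoted above (one cent per ten thousand data events). The function name and the request volume are hypothetical, and current pricing may differ.

```python
def cloudtrail_data_event_cost(event_count: int) -> float:
    """Dollar cost of CloudTrail data events at one cent per 10,000 events."""
    cents = event_count / 10_000
    return cents / 100


if __name__ == "__main__":
    # Hypothetical bucket fleet serving a billion S3 requests a month.
    monthly_requests = 1_000_000_000
    print(f"${cloudtrail_data_event_cost(monthly_requests):,.2f} per month")
```

At a billion requests a month, that is on the order of a thousand dollars for every month the trail stays enabled, which is why it pays to turn it off once the analysis is done.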

Question 9: Billing APIs as a Database

Next, let’s misuse things as databases:

Hell yes you can misuse the billing APIs as a database!

Remember, Cost Explorer API calls cost 1¢ per request. Sure, it’s account-wide. But, lest you forget, AWS accounts are free within Organizations. 

It’s a LOT of work for a relatively low volume database. But I’ve done far dumber things…

Question 10: Client Blunders

Next people want me to dish on client blunders:

https://twitter.com/udakil/status/1240025101174628353?s=20

I’m going to instead point out something that everyone misses—and that’s an induction to the Church of Our Lady of Turn It The Hell Off.

Our Lady implores us to turn things off when we’re not using them.

You copied a bunch of data to another bucket so you could experiment non-destructively? Our Lady compels you to delete it when you’re finished.

You spun up a few EC2 instances to benchmark how they handle a workload? Our Lady would like to remind you that those things cost money, so terminate them after you’re done.

You will never save more money than you will by turning things off.

That said, if you’re a Google employee, please instead make your way to the Wait I Was Using That Fellowship.

Question 11: Categorization

Next, a questioner attempts to plumb the depths of the unknowable:

I have no earthly idea why AWS categorizes things the way that they do. The bill feels like the purest expression of “you ship your culture,” and their org chart is truly bizarre.

With respect to tracking charges, this is why tagging is so important. If you have a bunch of Lambda functions that are part of an application, you can associate costs with things like network egress via cost allocation tags. Without it, you’re reduced to parsing VPC Flow Logs like some kind of animal. It’s disgusting.

Question 12: AWS Root User

And our last question is about the monstrosity known as the AWS root user.

You need the AWS root user to do a few things. Once you’ve delegated billing permission to IAM, the remaining tasks relevant to billing are:

  • Changing your support plan (hoo doggy does this have billing impact!)
  • Viewing tax invoices for some locations
  • Registering as a seller in the Reserved Instance Marketplace
  • Signing up for GovCloud
  • Closing your account

In other words, you’ll almost never need to do most of these.

And there you have it. Another Twitter AMA in the books.

In case you hadn’t noticed, I take questions from time to time, and do my darndest to respond to most of them. Ping me on Twitter to get yours answered–perhaps in shorter form next time!