I’ve given so much grief to the AWS Managed NAT Gateway over the last few years that if I were to pass all of that grief through one of the gateways themselves it would bankrupt my company. It occurred to me that while I’ve talked about my problems with the service in bits and pieces all over the place (read as: on Twitter, and incoherently over drinks in Seattle), I’ve never sat down and laid out my problems with it in a single place. It’s definitely time to fix that.
Before NAT Gateway, a pain in the butt
Let’s start at the beginning: When you set up a subnet inside an AWS Virtual Private Cloud (VPC), you have the option to route its traffic to an internet gateway. If you do this, it’s what’s known as a public subnet. If you don’t, it’s known as a private subnet. Nodes in that private subnet may still need to talk to things outside of that subnet. To allow this, you used to have to build and run your own NAT (network address translation) instances. This was a colossal pain in the butt. It required a lot of nuanced configuration, and these instances were effectively single points of failure for an entire subnet; using auto-scaling groups or load balancers to make it more available was obnoxious.
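To make the public/private distinction concrete, here is a minimal sketch that classifies a subnet by its route table. It assumes dictionaries shaped like the entries the EC2 `describe_route_tables` call returns; the function name and the sample IDs are made up for illustration.

```python
def is_public_subnet(route_table):
    """Return True if any route in the table targets an internet gateway.

    Internet gateway IDs start with "igw-"; the implicit VPC-local route
    uses the literal GatewayId "local" and does not count.
    """
    return any(
        route.get("GatewayId", "").startswith("igw-")
        for route in route_table.get("Routes", [])
    )


# Hypothetical route tables, hand-written in the describe_route_tables shape.
local_route = {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}
public_rt = {"Routes": [local_route,
                        {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc"}]}
private_rt = {"Routes": [local_route,
                         {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0def"}]}

print("public_rt is public:", is_public_subnet(public_rt))
print("private_rt is public:", is_public_subnet(private_rt))
```

The private table's default route points at a NAT device rather than an internet gateway, and that NAT device is the piece you used to have to build and run yourself.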
Then in 2015, AWS launched a Managed NAT Gateway service, and It. Was. AWESOME. Suddenly you didn’t have to jump through all of these hoops to run something delicate and complicated yourself; you clicked a button in the console or added a line of CloudFormation and it just worked. We quickly entered a place where the only people who ran their own NAT instances were either fossils from an earlier time or folks with very specific needs.
But there was a problem.
Surprise, you’ve got fees
The Managed NAT Gateway charges a fee for every hour that it’s running. That’s 4.5 cents per hour in the tier 1 regions. For large or enterprise customers, that’s comfortably in “nobody cares” territory. The trouble with this is an awful lot of tutorials set up private subnets as a matter of course, and it’s not immediately obvious that a Managed NAT Gateway is included. Further, there is no free tier for this service. Ergo, you have a student learner firing up a free tier account and suddenly getting slapped with a Surprise Fee when the monthly bill hits. It’s bad business, and it leaves a very sour taste when that’s your first encounter with the AWS billing system.
If this were its only billing dimension, I’d be annoyed but would have gotten over it many years ago; it fits into my larger “please fix the AWS free tier” argument.
The bigger problem is that AWS also charges 4.5 cents per gigabyte passed through the gateway as a “data processing fee” that’s completely separate from any data transfer fees assessed. And that’s where the thing melts down.
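Here is a rough sketch of how those two billing dimensions combine into a monthly bill. The rates are the tier 1 figures quoted above; the 730-hour month and the decimal conversion (1 PB = 1,000,000 GB) are my assumptions, and the function name is made up for illustration.

```python
HOURLY_RATE_USD = 0.045   # per gateway-hour, tier 1 regions (figure from the text)
PER_GB_RATE_USD = 0.045   # "data processing fee" per gigabyte (figure from the text)


def nat_gateway_monthly_cost(gb_processed, gateways=1, hours=730):
    """Return (hourly_charge, processing_charge, total) in USD for one month.

    hours=730 is an assumed average month; data transfer fees are billed
    separately and are deliberately left out.
    """
    hourly = HOURLY_RATE_USD * hours * gateways
    processing = PER_GB_RATE_USD * gb_processed
    return hourly, processing, hourly + processing


scenarios = [("idle tutorial account", 0),
             ("1 TB/month", 1_000),
             ("1 PB/month", 1_000_000)]
for label, gb in scenarios:
    hourly, processing, total = nat_gateway_monthly_cost(gb)
    print(f"{label:>22}: hourly ${hourly:,.2f} + processing ${processing:,.2f} = ${total:,.2f}")
```

Under those assumptions the hourly charge is the same small number in every row, while the processing charge scales linearly with traffic and never gets a volume discount.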
Fixed transfer fees add up fast
Recall that in us-east-1 (or other tier 1 regions) moving data between availability zones within a region as well as between some regions costs 2 cents per gigabyte. Sending that data to the internet costs 9 cents per gigabyte. Storing that data in S3 for a month costs 2.3 cents per gigabyte. Sending that data to a satellite in orbit via Ground Station costs I have no idea how much — I just think it’s incredibly nifty that this is a real thing that you can do and not something from a sci-fi novel.
But the Managed NAT Gateway data processing fee remains fixed at 4.5 cents per gigabyte, with no volume-based price breaks. And it drives me up a wall because it’s just so egregious once you hit nontrivial data transfer volumes.
How the conversation goes (unpleasantly)
When I’m looking at a client’s AWS bill and see significant Managed NAT Gateway data processing fees, I get a sinking feeling in my gut because I know that the customer is not going to be happy with what I’ve found. There are a few ways that conversation plays out, and none of them are pleasant; the customer invariably gets a harsh introduction to the facts of life as they discover just how thoroughly they’re being fleeced.
“We’re putting a petabyte of data through that a month, but you don’t understand: We’ve gotta get that data to and from S3.” I get it; I’m not suggesting you change your data flow! But if you add a (completely free) S3 gateway endpoint to your private subnet, suddenly that petabyte of traffic to and from S3 stops costing you $45,000 a month and becomes absolutely free. The fact that this isn’t set up by default is a rant for another time.
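For reference, adding that endpoint is a single API call. This is a minimal sketch using boto3's `create_vpc_endpoint`; the helper names and the VPC and route table IDs are placeholders I made up, so substitute your own.

```python
def s3_gateway_endpoint_request(vpc_id, route_table_ids, region="us-east-1"):
    """Build the keyword arguments for EC2's create_vpc_endpoint call."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Route tables of the private subnets that should reach S3 directly.
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, route_table_ids, region="us-east-1"):
    """Create the endpoint. Needs boto3 and AWS credentials with EC2 permissions."""
    import boto3  # imported lazily so the request builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    request = s3_gateway_endpoint_request(vpc_id, route_table_ids, region)
    return ec2.create_vpc_endpoint(**request)


# Placeholder IDs for illustration only.
print(s3_gateway_endpoint_request("vpc-0123456789abcdef0", ["rtb-0123456789abcdef0"]))
```

Once the endpoint is attached to the private subnets' route tables, S3 traffic from those subnets follows the endpoint route instead of the default route through the NAT Gateway.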
“We need to move a petabyte a month to and from the internet, and we can’t move the EC2 instances doing that into a public subnet due to Compliance.” I’m not one to argue with compliance requirements, I assure you! But in a scenario like this, setting up your own NAT instances and running them is a clear win. Yes, it’s finicky and annoying! Yes, it increases your team’s operational toil. But how much does it cost you to put that responsibility onto an existing team or hire a third-party consulting company (not us!) whose sole job is to run a set of NAT instances for you? If the answer is “less than $540,000 a year” (and it had damned well better be!) then you’re coming out ahead here.
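If you want to run that comparison for your own numbers, here is a small sketch. The per-gigabyte rate is the one quoted earlier; the $200,000 annual cost of running your own NAT instances is a number I invented purely for illustration, and internet data transfer fees are left out because you pay them either way.

```python
PER_GB_RATE_USD = 0.045  # Managed NAT Gateway data processing fee per GB


def annual_processing_fee(gb_per_month):
    """Yearly data processing fee in USD for a steady monthly volume."""
    return PER_GB_RATE_USD * gb_per_month * 12


def break_even_gb_per_month(annual_self_run_cost_usd):
    """Monthly volume above which self-run NAT instances cost less."""
    return annual_self_run_cost_usd / (PER_GB_RATE_USD * 12)


def self_run_wins(gb_per_month, annual_self_run_cost_usd):
    """True when running your own NAT is cheaper than the processing fee."""
    return annual_self_run_cost_usd < annual_processing_fee(gb_per_month)


# Hypothetical all-in yearly cost of self-run NAT (instances plus people).
hypothetical_cost = 200_000
print(f"break-even volume: {break_even_gb_per_month(hypothetical_cost):,.0f} GB/month")
print("self-run wins at 1 PB/month:", self_run_wins(1_000_000, hypothetical_cost))
```

Below the break-even volume the managed service is still the cheaper option, which is why this only becomes an argument once the traffic gets large.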
“Wait, you’re telling me that this one change just paid for our entire consulting engagement with The Duckbill Group?” Yes! Many times over! And I promise you, this brings me no joy whatsoever. This is a great example of why we only ever charge a fixed fee for our cost-optimization projects. Can you imagine how royally pissed off a client would be at having to pay a percentage of their Managed NAT Gateway charges to us via some sort of “we charge you a portion of the savings” cost model? They’d be right to be upset: this isn’t high-value differentiated work, it’s pointing out a stick that’s used to smack an awful lot of customers.
It’s not the service, it’s the fees
My issue is not that the service is bad; far from it! This is exactly what I want AWS to be building: services that reduce toil and remove undifferentiated heavy lifting that every company has to do themselves. Running your own NAT instances is a terrible practice that I strive to avoid! It’s solving a global problem locally, and if we’ve gotta do that why are we even using cloud providers in the first place?
No, my issue is solely around the pricing of the service at both ends. In isolation, a Managed NAT Gateway doesn’t do anything! I can’t spin up Managed NAT Gateways to serve web traffic, or mine bitcoin, or misuse them as a database. If you gave me a magic wand, I’d either make the service entirely free or offer a generous free tier and wipe the data processing fees entirely.