AWS Data Transfer billing is, to understate massively, arcane. At a high level:
- Data transfer inbound is free.
- Data transfer to the internet costs a larger pile of money.
- Data transfer between availability zones within a region costs a pile of money (usually 2¢ per GB in the “main” regions, but costs can spike way higher).
Most of this is more or less understood. AWS certainly doesn’t help matters any by referring to the third item, cross-AZ transfer, as “regional data transfer – in/out/between EC2 AZs or using elastic IPs or ELB.” It feels like accounting misdirection. It further doesn’t help that every GB transferred between AZs appears on your bill as 2GB: once on the way out of one AZ and once on the way into the other.
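To make the doubling concrete, here is a back-of-the-envelope sketch. It assumes the cross-AZ rate is metered at $0.01 per GB in each direction, which is how the "usually 2¢ per GB" figure comes about in the main regions; the function name and the 500 GB figure are purely illustrative.

```python
# Assumed rate: $0.01 per GB in each direction, which adds up to the
# "usually 2 cents per GB" figure. Check your own region's pricing.
RATE_PER_GB_EACH_DIRECTION = 0.01


def cross_az_line_items(gb_transferred):
    """Return (billed_gb, cost_usd) for traffic sent from one AZ to another."""
    # The sending side is metered for "out" and the receiving side for "in",
    # so each GB that actually moves is counted twice on the bill.
    billed_gb = gb_transferred * 2
    cost_usd = billed_gb * RATE_PER_GB_EACH_DIRECTION
    return billed_gb, cost_usd


billed_gb, cost_usd = cross_az_line_items(500)
print(f"{billed_gb} GB billed, ${cost_usd:.2f}")
```

Under that assumption, moving 500 GB between AZs shows up as 1,000 GB of regional data transfer.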
Let’s also point out that this data transfer between AZs isn’t just bound to “AWS accounts you control.” If you’re in us-east-1 and have customers sending telemetry to you in large volumes from environments that are also in us-east-1, relocating your collection endpoint to another region or cloud provider suddenly makes a lot of sense. Someone at a big monitoring company just spat coffee all over their screen while reading this paragraph. Call me!
What’s worse is that, unlike every other AWS resource, you can’t pay less to degrade the experience of your data transfer. You can’t say “charge me less and reduce my replication time to ‘a week from Tuesday’” in the same way as you can save money by reducing RAM or disk. This gives the mistaken appearance that there’s nothing you can do about this data transfer burden.
Today, I’d like to talk about a variety of ways to work around AWS’s data transfer pricing; some may be more viable for your use case than others.
Talk to your account manager
I would be remiss if I didn’t start by suggesting you talk to your account manager about your data transfer woes. At scale, discounting becomes an option; if your use case is sufficiently strange, AWS may be able to do more for you.
I don’t expect this to be a panacea. Rather, the reason I’m starting with this point is that whatever you try next may generate a “what are you DOING” response from AWS, and by starting with the Proper Way, your position becomes much more defensible. “Well, I did ask,” is your rejoinder.
Transfer less
In most cases, traffic between AZs is some form of replication traffic.
If you’re keeping four copies of the data across four AZs, consider whether you can get by with three. Alternately, if that’s untenable, you’d do well to remember that inbound traffic is free. So what about having your customers send you their data twice?
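For a rough sense of what dropping a copy is worth, here is a sketch under some loud assumptions: one copy is written locally and every additional copy crosses an AZ boundary exactly once, at the "usually 2¢ per GB" rate. The function name and the monthly volume are hypothetical.

```python
# Assumed all-in cross-AZ rate, taken from the "usually 2 cents per GB" figure.
CROSS_AZ_RATE_PER_GB = 0.02


def monthly_replication_cost(gb_written, copies):
    """Estimate the cross-AZ cost of keeping `copies` copies, one per AZ."""
    # Assumption: the first copy lands in the writer's own AZ for free,
    # and each remaining copy crosses an AZ boundary once.
    cross_az_copies = max(copies - 1, 0)
    return gb_written * cross_az_copies * CROSS_AZ_RATE_PER_GB


four_copies = monthly_replication_cost(100_000, copies=4)
three_copies = monthly_replication_cost(100_000, copies=3)
print(f"4 copies: ${four_copies:,.2f}, 3 copies: ${three_copies:,.2f}")
```

Under these assumptions, going from four copies to three cuts the cross-AZ replication bill by a third, since three crossings per write become two.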
Sure, mobile users aren’t going to love you for this. But if your model is “20 MB a month each across half a billion customers,” who’s really going to notice?
And of course, stop using Kubernetes. Out of the box, it isn’t aware of AZ affinity, so it will gleefully toss traffic across expensive links instead of free ones. It looks to AWS like one big misbehaving application.
Use managed services
If you’re running databases, message queues, or other services yourself on top of EC2 instances — or are paying a third-party vendor to manage their offering for you — consider an AWS managed service that offers free replication traffic cross-AZ.
RDS, MSK, and Amazon Elasticsearch Service are all examples of this. Are they less capable and less tunable than running it yourself? Absolutely. Are the savings significant at scale? Unquestionably. Is this an antitrust issue waiting to happen? You would think so!
On some level, AWS doesn’t need to make its services clearly better. They need to be “good enough” to not get their offering rejected out of hand — because the data transfer pricing becomes a compelling argument for some workloads at scale.
Misuse storage services
A number of services skip the cross-AZ data transfer charge even more directly than the previous examples.
S3 has no cross-AZ fee; data transfer is free with only a request charge to contend with. If you can pass data cross-AZ using S3 as a conduit, go you.
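A minimal sketch of the conduit pattern, assuming a boto3-style S3 client and a bucket you already own. The helper names `send_via_s3` and `receive_via_s3` are hypothetical, as are the bucket and key; `put_object`, `get_object`, and `delete_object` are the standard S3 client calls.

```python
def send_via_s3(s3_client, bucket, key, payload):
    """Producer side (in AZ A): write the payload as a single object."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=payload)


def receive_via_s3(s3_client, bucket, key):
    """Consumer side (in AZ B): read the object, then delete it."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    payload = response["Body"].read()
    # Clean up so the "conduit" doesn't quietly turn into a storage bill.
    s3_client.delete_object(Bucket=bucket, Key=key)
    return payload
```

With boto3 installed, you would pass `boto3.client("s3")` as `s3_client`. Each hand-off costs at least one PUT and one GET request, so batching many small messages into one larger object is what keeps the request charge from replacing the transfer charge.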
Similar patterns apply to EFS if teaching your applications to speak “object store” is too heavy of a lift. There’s similarly no cross-AZ charge for DynamoDB if you’d rather run a database as a message pipeline; you’d just pay for throughput. Set a reasonably low TTL on your items and you don’t even have to clean up the storage usage after you’ve moved the data!
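For the DynamoDB variant, the TTL trick comes down to writing a Number attribute holding a Unix epoch timestamp and enabling TTL on that attribute. A sketch, assuming a table keyed on a hypothetical `message_id` attribute with TTL enabled on a hypothetical `expires_at` attribute; the dict is in the typed format that the low-level `put_item` call takes as its `Item` argument.

```python
import time


def build_relay_item(message_id, payload, ttl_seconds=300, now=None):
    """Build a DynamoDB item that expires `ttl_seconds` after it is written."""
    # `now` is injectable so the expiry can be computed deterministically.
    now = int(time.time()) if now is None else int(now)
    return {
        "message_id": {"S": message_id},  # hypothetical partition key
        "payload": {"B": payload},  # binary payload
        # TTL expects a Number attribute in Unix epoch seconds; the
        # low-level API carries numbers as strings.
        "expires_at": {"N": str(now + ttl_seconds)},
    }


item = build_relay_item("msg-0001", b"telemetry bytes", ttl_seconds=300)
print(item["expires_at"])
```

You would hand the result to something like `boto3.client("dynamodb").put_item(TableName=..., Item=item)`. Two caveats: TTL deletion is a background process and can lag well behind the expiry time, so consumers should ignore items that have already expired, and DynamoDB items max out at 400 KB, so this only suits small messages.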
Misuse other services
Neither SNS nor Amazon Chime charges money for data transferred cross-AZ.
Note that SNS charges per message, but Chime doesn’t. “Wait—that seems like something that might get Amazon to shut down your Chime account” is the logical objection here. But, as a counterpoint, imagine how sad Amazon Chime’s DAU numbers would be if they ripped out all of the bots using it for backend communications like this!
Sue AWS for something
This one works way better for data you want to extract from AWS entirely — think an exabyte of data stored in S3. Your solution here is to invent a reason to sue AWS for something.
Keep in mind that it need not be a good reason. As long as it doesn’t get tossed out during summary judgment, you can demand all of your data during the discovery phase of the suit by attesting that it’s relevant to your nonsense lawsuit.
While I wouldn’t put it past AWS to provide that data via 40 tractor trailers full of paper printouts, that would require them to demonstrate way more creativity than service names like “AWS Trainium” would indicate they possess.
What terrible ideas do you have?
So there you have it: a general overview of various end-runs around AWS’s egregious cross-AZ data transfer pricing.
Have I missed any (the more ridiculous the better!)? Be sure to reach out and let me know.