I begin this week with an admission.
A while back, I opined on Twitter that a site I ran, stop.lying.cloud, was unacceptably slow. To save you a click, that site retrieves the AWS status page, applies some transforms to clean up part of the “endless sea of green,” and returns the result.
It did this via Lambda@Edge. I now know that this was wrong.
Instead of the ~23 seconds it took when I made the complaint, it now comfortably and reliably returns in less than a second every time.
The fix, as helpfully suggested by Randall Hunt, was dead simple: Stop trying to do this for every request.
Instead, have a Lambda function fire off once a minute, perform the transform, stuff that into an S3 bucket behind CloudFront without caching, and call it a day.
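For the curious, here is a minimal sketch of what that once-a-minute function might look like. It is not the actual stop.lying.cloud code: the URL, bucket name, and key are hypothetical placeholders, and `transform` is a stand-in because the real cleanup logic isn't shown here. It assumes the function is triggered by a one-minute schedule and that `boto3` is available, as it is in the Lambda Python runtime.

```python
import urllib.request

# Hypothetical placeholders; none of these values come from the real site.
STATUS_URL = "https://example.com/status"
BUCKET = "example-status-site-bucket"
KEY = "index.html"


def fetch(url: str = STATUS_URL) -> str:
    """Download the upstream page as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def transform(html: str) -> str:
    """Stand-in for the real cleanup; here it only normalizes line endings."""
    return html.replace("\r\n", "\n")


def publish(s3_client, body: str, bucket: str = BUCKET, key: str = KEY) -> None:
    """Write the finished page to S3, where CloudFront can serve it."""
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=body.encode("utf-8"),
        ContentType="text/html; charset=utf-8",
    )


def handler(event, context):
    """Entry point for the scheduled invocation; the event payload is ignored."""
    import boto3  # imported here so the helpers above work without it

    publish(boto3.client("s3"), transform(fetch()))
    return {"status": "published"}
```

Nothing in `handler` looks at the incoming event, which is the whole point: the work happens on a timer, not in response to a visitor.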
Did you know that?
Two groups of people will read this and have different responses:
- Group 1: No kidding, of course that’s how it works.
- Group 2: Holy crap, I can’t believe I didn’t think of that. They might get angry about that.
I was solidly in Group 2 on that one, and I suspect I’m not alone.
This distills down to the mental model I held for Lambda functions, specifically Lambda@Edge. My mental model was that Lambda would apply per request, operating on the payload of that request. This is how it works in every example you’ll see that uses Edge functions, from high-level dynamic responses to injecting the same static headers on every request.
People think about this in a steady, reliable way: Lambda fires off when there’s work to be done.
If you can decouple how you think about Lambda from the request invocation model, two things happen:
- It absolutely broadens the applicability of Lambda to a variety of problems you may have previously considered it ill-suited for due to cold starts, latency issues, scalability, etc.
- Purists will be mad at you.
Some people (me, for example) treat Lambda as a synchronous compute resource when it’s also a highly capable asynchronous response resource. (Don’t confuse this with sync/async invocations; this is another layer of the stack entirely.)
In other words, if you can “preload” the work, then the time it takes the Lambda to provision, invoke, execute, and return becomes irrelevant.
The billing also becomes advantageous above certain volumes; the Lambda charge for this example remains fixed regardless of how much or how little traffic the site receives.
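To put rough numbers on "fixed," here is a back-of-the-envelope sketch. It counts invocations only and deliberately ignores duration, memory, and the pricing differences between regular Lambda and Lambda@Edge, so treat it as an illustration of the shape of the curve and not a cost estimate. The function names are made up for this example.

```python
MINUTES_PER_DAY = 24 * 60


def scheduled_invocations(days: int = 30) -> int:
    """Invocations for a function that fires once a minute, whatever the traffic."""
    return days * MINUTES_PER_DAY


def per_request_invocations(monthly_requests: int) -> int:
    """Invocations when the function runs once for every incoming request."""
    return monthly_requests


if __name__ == "__main__":
    fixed = scheduled_invocations()
    for requests in (1_000, 43_200, 1_000_000):
        print(f"{requests:>9} requests: per-request={per_request_invocations(requests)}, scheduled={fixed}")
```

Over a 30-day month the scheduled function runs 43,200 times. Below that many requests, the per-request model invokes less often; above it, the scheduled model does, and the gap only widens as traffic grows.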
The root of this divergence of understanding comes down to how Lambda was originally positioned. It started out as a service that was defined by its restrictions: only certain languages were permitted, you could only have a certain runtime per function, cold starts were a serious problem, you could only allocate so much RAM per function, and so on.
Over time, these restrictions were relaxed and the service expanded. So, a lot of those early understandings of the platform rapidly became out of date.
Latency, schmatency
The idea of a Lambda invoking on every user request was in no small part how the vision of the service was presented. My new approach of having it fire off once a minute like some kind of blunt instrument is anathema to that perspective. But it’s also effective.
Suddenly, latency of the Lambda function no longer matters; if it takes 2 seconds to run or 20, it’s all the same to the functioning of the stop.lying.cloud site. Should it completely fail to execute because distributed systems are hard, viewers will simply see a site that’s a minute out of date.
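That failure mode is worth spelling out, because it falls out of the design for free. The sketch below uses a plain dictionary as a stand-in for the S3 bucket, and `refresh` and `build` are hypothetical names invented for this illustration: if building the page fails, nothing gets written, and the previous copy stays in place until the next run succeeds.

```python
from typing import Callable, Dict


def refresh(store: Dict[str, str], key: str, build: Callable[[], str]) -> bool:
    """Rebuild the page and overwrite the stored copy only if the build succeeds."""
    try:
        fresh = build()
    except Exception:
        # Build failed: leave the last good copy untouched for visitors.
        return False
    store[key] = fresh
    return True


if __name__ == "__main__":
    bucket = {"index.html": "page from one minute ago"}

    def broken_build() -> str:
        raise RuntimeError("upstream fetch timed out")

    refresh(bucket, "index.html", broken_build)
    print(bucket["index.html"])
```

In the real setup the same effect comes from the function simply erroring out before it reaches the upload step; the stale object in the bucket keeps getting served.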
It’s still a horrifying site—the HTML that’s generated is 7.5MB because of what it’s forced to deal with, and there are no doubt further optimizations that can be made—but the honest truth of the matter is that I don’t care. It’s a toy site that I came up with as a joke—and to get experience with a few different technologies.
I no longer need that experience. So, “just do the thing and get out of my way” becomes my perspective—and what that lacks in elegance, it makes up for in “solving the problem so I can focus on higher-value tasks.”
This is, more than anything else, the actual spirit of Lambda (and Serverless as a whole, for that matter): getting the technology out of the way to solve the business problem as efficiently as possible.
If it’s not done in the most technically elegant way? Well, you’re probably doing it right.