Fixing Obscure Bugs: Apache, GZip, ETags, and Edge Compute
This post highlights an interesting use of edge compute to work around an obscure performance bug with Apache's GZip module.
I was recently introduced to an obscure performance bug with Apache’s GZip module and a clever solution using EdgeWorkers. So, I made a video and a small blog post to cover it.
One thing I love about my job is getting to work with people who are way smarter than me at things I never had the interest in, or the opportunity to work on.
For example, although Apache is arguably the most popular HTTP server technology on the planet, it’s not without flaws. If you look back through the bug reports, you may find one called “Incorrect ETag on gzip:ed content,” which was reported in 2006 by Henrik Nordstrom.
Henrik noticed that entities that were GZipped using the mod_deflate module still carried the same ETag as the plain entities, so ETag-aware proxy caches could not tell the compressed and uncompressed versions apart.
If you’re not familiar, ETags are HTTP response headers that are basically used to identify a specific version of a resource. The value of the ETag is often a hash value that’s generated from either the file contents or the date modified or some combination of the two.
It might look something like this:
ETag: 27dc5-556d73fd7fa43
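As a rough illustration of where a value like that can come from, here is a minimal sketch that builds an ETag-style string from a file's size and modification time, each written in hexadecimal. The function name `makeEtag` and its parameters are hypothetical, and real servers may combine different attributes or hash the file contents instead.

```javascript
// Hypothetical helper: build an ETag-style value from file metadata.
// Assumption: sizeBytes and mtime are non-negative integers.
function makeEtag(sizeBytes, mtime) {
  // Hex-encode each part and join them with a dash
  return `${sizeBytes.toString(16)}-${mtime.toString(16)}`;
}

console.log(makeEtag(255, 16)); // prints "ff-10"
```

Because the value is derived from the file's metadata, it changes whenever the file does, which is what lets a cache use it as a version identifier.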
If an HTTP request includes an If-None-Match header that matches an existing ETag response header, a proxy cache (like a CDN) can assume that the file has not been modified and may respond with a 304 “Not Modified” status.
In theory, this would be faster and save bandwidth.
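The decision described above can be sketched in a few lines. This is a simplified model, not how any particular cache is implemented: the function name `respond` and the shape of its return value are assumptions, and a real implementation would also handle lists of ETags, the `*` wildcard, and weak validators.

```javascript
// Hypothetical sketch of a conditional-request check.
// ifNoneMatch: value sent by the client (or undefined if absent)
// currentEtag: ETag of the version the cache or server holds now
function respond(ifNoneMatch, currentEtag, body) {
  if (ifNoneMatch !== undefined && ifNoneMatch === currentEtag) {
    // The client already has this version: send no body
    return { status: 304, body: '' };
  }
  // Otherwise send the full resource
  return { status: 200, body };
}

console.log(respond('27dc5-556d73fd7fa43', '27dc5-556d73fd7fa43', '<html>...</html>').status); // prints 304
```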
When you enable the GZip compression module in Apache, it will actually append a little “-gzip” suffix to the end of these generated hash values for the ETags. Unfortunately, when Apache goes to compare the If-None-Match header (“27dc5-556d73fd7fa43-gzip”) to the ETag value (“27dc5-556d73fd7fa43”) it doesn’t account for the suffix.
As a result, it’s always going to serve the entire resource payload, which is ironic considering ETags and GZip are both ways to, in theory, save bandwidth.
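To make the mismatch concrete, here is a small sketch comparing an exact string match with a suffix-aware one. The function names are hypothetical, this is not Apache's actual comparison code, and the values are written without surrounding quotes to match the examples in this post.

```javascript
const GZIP_SUFFIX = '-gzip';

// Exact comparison: the suffix makes otherwise identical ETags differ
function naiveMatch(ifNoneMatch, storedEtag) {
  return ifNoneMatch === storedEtag;
}

// Hypothetical helper: drop a trailing '-gzip' if present
function baseEtag(etag) {
  return etag.endsWith(GZIP_SUFFIX) ? etag.slice(0, -GZIP_SUFFIX.length) : etag;
}

// Suffix-aware comparison: compare the values with the suffix removed
function suffixAwareMatch(ifNoneMatch, storedEtag) {
  return baseEtag(ifNoneMatch) === baseEtag(storedEtag);
}

console.log(naiveMatch('27dc5-556d73fd7fa43-gzip', '27dc5-556d73fd7fa43'));       // prints false
console.log(suffixAwareMatch('27dc5-556d73fd7fa43-gzip', '27dc5-556d73fd7fa43')); // prints true
```

With the exact comparison, the 304 path is never taken, so the full payload goes out every time.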
I hope I was able to explain this issue clearly, but if I haven’t, there’s an excellent blog post entitled “Apache, ETag, and ‘Not Modified’” on Flameeyes’s Weblog. It even provides the solution.
So, if the problem is already solved, why am I talking about this today?
Well, the solution requires that you modify the Apache config. But as a developer, I know all too well that there are many cases where we can’t do that ourselves. We may not have permissions, or we may be working with a third-party service.
And here’s where the interesting solution comes up.
Akamai has a product called EdgeWorkers, which can basically sit in between a client and an origin server and run JavaScript logic.
So when a client makes a request, the EdgeWorker can intercept the request and modify it on its way to the origin server. Or, when an origin server sends a response, the Akamai EdgeWorker can intercept that response and modify it before it gets passed along back to the client.
In theory, a developer who doesn’t have access to the Apache config can still monkey patch a temporary fix until the problem can be fixed at the Apache config level.
It might look something like this:
// Modify origin responses
export function onOriginResponse(request, response) {
  // Grab the server name (fall back to '' if the header is missing)
  const serverName = (response.getHeader('Server') || [])[0] || '';
  // Grab the ETag header (fall back to '' if the header is missing)
  const etag = (response.getHeader('ETag') || [])[0] || '';
  // Check if it's an Apache server and the ETag ends with '-gzip'
  // (allowing for a closing double quote around the value)
  if (serverName.startsWith('Apache') && /-gzip"?$/.test(etag)) {
    // If so, strip out the '-gzip' suffix and keep the closing quote, if any
    response.setHeader('ETag', etag.replace(/-gzip("?)$/, '$1'));
  }
}
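If you want to try the suffix-stripping logic outside of EdgeWorkers, one option is to pull it into a plain function and run it against a stand-in response object. The sketch below assumes the header value may or may not be wrapped in double quotes. `stripGzipSuffix` and `makeStubResponse` are hypothetical names, and the stub only imitates the two methods used above rather than the real EdgeWorkers response object.

```javascript
// Hypothetical pure helper: remove a trailing '-gzip' from an ETag value,
// whether or not the value is wrapped in double quotes.
function stripGzipSuffix(etag) {
  return etag.replace(/-gzip("?)$/, '$1');
}

// Hypothetical stand-in for a response object, for local experiments only.
function makeStubResponse(headers) {
  return {
    getHeader: (name) => headers[name],
    setHeader: (name, value) => { headers[name] = [value]; },
  };
}

const stub = makeStubResponse({
  Server: ['Apache'],
  ETag: ['"27dc5-556d73fd7fa43-gzip"'],
});

stub.setHeader('ETag', stripGzipSuffix(stub.getHeader('ETag')[0]));
console.log(stub.getHeader('ETag')[0]); // prints "27dc5-556d73fd7fa43" (with the quotes)
```

Keeping the string handling in its own function means it can be checked with ordinary unit tests before it is deployed to the edge.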
I realize that this may not actually apply to a lot of folks out there, but I wanted to share it because it strikes a nice balance of obscure bugs, deep technical knowledge, and only a few lines of code to fix. Especially when you include the Akamai EdgeWorkers part, it’s just a cool, elegant approach that I never would have thought of before.
So, I hope you also found it interesting, even if it’s not immediately relevant to you today.
Published at DZone with permission of Austin Gil. See the original article here.