Tuning Performance for New and “Old” Friends

Posted by Scott Jehl 03/07/2019

For performance reasons, we often configure our sites to deliver their code in slightly different ways for new and returning visitors. For example, for new visitors, we might choose to inline or http2-push important resources along with a page’s HTML so that they are ready to render immediately on arrival. Additionally, for a new visit we might set up a page’s HTML to reference its resources in a cautious, fault-tolerant manner, so that they can layer progressively into an already-usable page as they arrive (without overly relying on them to arrive quickly, or even at all!).

However, for subsequent visits to other pages on our site, many of the resources that pages require will already be in the browser’s cache, so we would not want to inline or push files unnecessarily (relying on the browser to cancel them before wastefully downloading them again). Also, given the knowledge that important assets are already cached, our HTML itself can take advantage by referencing resources in a way that will render its completed state immediately, instead of unnecessarily stepping through a progressive render.

Variation, in Practice

In order to implement this sort of variation, we need a way for the browser to inform the server that its cache is populated. Unfortunately, no standard mechanism exists for this in browsers today, though in the future, the proposed Cache Digests feature may handle this for us. To mimic this functionality without such a feature, we typically use a cookie: the first time anyone visits the site, we set a cookie that includes the version of our site’s current build, and on every visit, the server checks whether a cookie for the current build version is present to decide which version of our code to return. This works pretty well, but it has potential gotchas to consider. For example, many people choose to block cookies in their browser, and they may clear their browser’s cache on occasion (perhaps without clearing their cookies as well). In either case we may be unable to accurately detect whether we should send code that’s optimized for a new or a returning visitor. Granted, this is not a big problem: if we guess wrong, a page just might take a little longer to load than it could have. Still, we do want to be able to optimize as best we can!
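
For illustration, the server-side half of that cookie check might look something like the sketch below. It is a minimal example under stated assumptions: the cookie name build and both function names are made up here, and a real setup would likely lean on its framework's cookie parsing instead.

```javascript
// Hypothetical cookie name; the post does not say what the cookie is called.
const COOKIE_NAME = 'build';

// Parse a raw Cookie request header ("a=1; b=2") into a plain object.
function parseCookies(header) {
  const cookies = {};
  for (const pair of (header || '').split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue; // skip empty or malformed segments
    cookies[pair.slice(0, index).trim()] = pair.slice(index + 1).trim();
  }
  return cookies;
}

// True only when the visitor's cookie matches the build currently deployed.
function isReturningVisitor(cookieHeader, currentVersion) {
  return parseCookies(cookieHeader)[COOKIE_NAME] === currentVersion;
}
```

A missing cookie and a cookie from an older build both count as a "new" visit here, which matches the safe default: when in doubt, send the first-visit version of the page.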

Recently, I learned that Service Workers might be a better tool for this job than cookies are. We love Service Workers, and we typically use them to manage requests so that a site is faster and more resilient on poor connections, or even offline. That said, Service Workers can also manipulate requests before they go out from the browser, changing a request’s conditions and even adding custom headers. For example, here’s an abbreviated snippet from a Service Worker that will reconfigure an outgoing request for any HTML file to include a custom header called X-CACHED, set to a string representing the version of the current site build.

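const version = "fgv56";
self.addEventListener('fetch', event => {
  // Only rewrite requests for HTML documents; a missing Accept header counts as non-HTML
  const accept = event.request.headers.get('Accept') || '';
  if (accept.indexOf('text/html') !== -1) {
    // Copy the original headers into a new, mutable Headers object.
    // Headers is iterable as [name, value] pairs (a for...in loop would copy nothing).
    const reqHeaders = new Headers();
    for (const [name, value] of event.request.headers) {
      reqHeaders.set(name, value);
    }
    reqHeaders.set('X-CACHED', version);
    const newReq = new Request(event.request.url, {
      method: "GET",
      headers: reqHeaders
    });
    // fetch event handler continues...
    // newReq is now ready to fetch from network or local caches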

(Small note: an existing request’s headers are immutable, so I had to make a new request and copy the headers over to it.)
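
The header-copying step can also be pulled out into a small helper, which makes it easier to reason about on its own. This is only a sketch: the function name headersWithCacheToken is hypothetical, and it returns a plain object (which the Request constructor accepts as its headers option) rather than a Headers instance.

```javascript
// Build a mutable copy of a request's headers with the cache token added.
// originalHeaders can be any iterable of [name, value] pairs, which includes
// a Headers object such as event.request.headers.
function headersWithCacheToken(originalHeaders, version) {
  const copy = {};
  for (const [name, value] of originalHeaders) {
    copy[name] = value;
  }
  // Header names are case-insensitive. Headers iterates them in lowercase,
  // so a lowercase key here overwrites any token already on the request.
  copy['x-cached'] = version;
  return copy;
}

// Possible use inside a fetch handler (assumes event and version exist):
//   const newReq = new Request(event.request.url, {
//     method: 'GET',
//     headers: headersWithCacheToken(event.request.headers, version)
//   });
```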

The server can now check an HTML request’s headers and make decisions based on the relevant token and version. For example, here’s a block from an Apache configuration file that will http2-push a CSS file and some fonts if the header and version are NOT present in a request for an HTML file (meaning it’s a first-time, uncached visit). Only the CSS push is shown here:

<If "%{HTTP:x-cached} !~ /fgv56/ && %{DOCUMENT_URI} =~ /\.html$/">
        H2PushResource add /css/dist/all.css
</If>

Further, our HTML templates can look for the custom header too and vary their code output on the fly. For example, one pattern we often use for first-time visits is to load our fonts asynchronously, then apply them to the page after they load via an HTML class like fonts-loaded. This allows us to ensure that a page’s text is visible while those fonts are loading (particularly on slow connections). However, on returning visits to our site, the fonts will often be in cache, so we can set up our templates to include that font class immediately (paired with a few other tweaks), ensuring a nice clean font render:

<html <!--#if expr="$HTTP_X_CACHED=/fgv56/" -->class="fonts-loaded"<!--#endif --> >
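
To make the two server-side decisions easier to see side by side, here is the same logic as a small JavaScript function. It is a sketch rather than anything from our actual setup: the function name planHtmlResponse is hypothetical, and it assumes the server exposes request headers as an object with lowercase names (as Node's http module does).

```javascript
const BUILD_VERSION = 'fgv56';

// Decide what to push and which class to put on the html element,
// based on whether the request carries the current build's token.
function planHtmlResponse(requestHeaders) {
  const cached = requestHeaders['x-cached'] === BUILD_VERSION;
  return {
    // First visit: push the stylesheet. Returning visit: push nothing.
    push: cached ? [] : ['/css/dist/all.css'],
    // Returning visit: fonts should be in cache, so apply them right away.
    htmlClass: cached ? 'fonts-loaded' : ''
  };
}
```

A token from an older build is treated the same as no token at all, so a new deploy automatically sends everyone back through the first-visit path once.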

Pairing Header State with Actual Cache State

So far, I’ve mostly reproduced the cookie pattern without using a cookie, but using a custom header has value beyond just that. It can also more accurately communicate that files have actually been cached. The neat thing about a service worker is that it begins working once it has finished “installing”, and that installation can be configured to depend on certain criteria being met. Here’s an example that I originally learned about from Jeremy Keith, which uses event.waitUntil to delay installation until a list of files has been cached.

const staticCacheName = "fgv56::static";
const staticAssets = [
  '/css/type/Lato-Light.woff2',
  '/css/type/Lato-Regular.woff2',
  '/css/type/Lato-Bold.woff2',
  '/css/type/Lato-Black.woff2'
];

function updateStaticCache() {
  return caches.open(staticCacheName)
    .then( cache => {
      return cache.addAll(staticAssets.map(url => new Request(url)));
    });
}

self.addEventListener('install', event => {
  event.waitUntil(updateStaticCache()
    .then( () => self.skipWaiting() )
  );
});

To summarize the code above: the install listener at the end ensures that the service worker only finishes installing after updateStaticCache() resolves. That same service worker file also contains the snippet I showed earlier, which appends the custom header to outgoing requests, so the header will only be included once the worker is actually installed and the static files are actually cached. Neat! This pattern seems to give us a more reliable way to tell the server that files are really in cache.

Thanks!

Thanks for reading along. This article is part of our ongoing process of researching and discovering better ways to handle page delivery, so if you have any feedback, we’re quite interested in hearing it. Hit us up on Twitter @filamentgroup!

Caveats and Omissions

In the interest of a clearer narrative, I glossed over a few technical details that help make this workflow fully work.

  • For one, when HTML output can vary like this, the server should be configured to send a Vary header to tell browsers and intermediate caches that the output of HTML files varies based on the X-CACHED request header. The same consideration applies when varying based on cookies, and it’s the sort of configuration that helps a lot when implementing this variation over a CDN that caches static output of files for many users.
  • Also, be mindful that if your service worker caches HTML files for potential offline use, it will cache those files including their X-CACHED header, so any future requests for those files in local cache will need to include that header as well. This tripped me up when attempting to retrieve our site’s offline.html fallback page, for example, which was already stored including that custom header. Instead of merely fetching it by URL, I needed to build a request with the custom header to match.
  • Lastly, since it’s possible for users to delete files from the Caches that are used by service workers, it is possible that the service worker could continue to set the custom header despite files no longer existing in cache. However, when clearing browser caches, these particular caches seem to be coupled with other “site data” (rather than being part of the regular browser cache) and clearing that site data will delete the service worker along with the cached files. My assumption is that this means the likelihood of a mismatch between the header and the cache state is pretty low. That said, if someone wanted to be extra careful to only send the header when appropriate files are actually cached, they could add a step to the request workflow above that compares the files in the array to the files actually in the cache. To me, this seemed like overkill, but it might be a good idea to explore.
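
For anyone who does want that extra check from the last point, it could be sketched roughly as follows. The function name allAssetsCached is hypothetical, and the optional third parameter exists only so the function can be exercised outside a service worker. Inside one, it defaults to the global caches object.

```javascript
// Resolve to true only if every listed asset is present in the named cache.
async function allAssetsCached(cacheName, assets, cacheStorage = globalThis.caches) {
  const cache = await cacheStorage.open(cacheName);
  // cache.match() resolves to undefined when nothing matches the URL
  const matches = await Promise.all(assets.map(url => cache.match(url)));
  return matches.every(response => response !== undefined);
}

// Possible use in a fetch handler, before adding the header
// (assumes staticCacheName and staticAssets from the install snippet):
//   if (await allAssetsCached(staticCacheName, staticAssets)) { /* set X-CACHED */ }
```

The tradeoff is an extra asynchronous cache lookup on every HTML request, which is part of why it may be overkill for most sites.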