We need more inclusive web performance metrics

There’s been great momentum in the web performance tooling world lately, and our ability to identify and fix bottlenecks that impact real users has never been better. Perhaps the most important outcome of that momentum has been identifying which page loading metrics matter most to real users. Among those, user-“perceived” metrics like First Contentful Paint and Largest Contentful Paint get a great deal of attention: they mark meaningful moments in the visual page loading experience, when content is visible (though not yet interactive) and can be digested by users.

These metrics are often touted as measures of usability or meaning, but they are not necessarily meaningful for everyone. In particular, users relying on assistive technology (such as a screen reader) may not perceive steps in the page loading process until after the DOM is complete, or even later, depending on how JavaScript may block that process. Also, a page may not be usable with assistive technology until it becomes fully interactive, since many applications deliver accessible interactivity via external JavaScript. Like other areas of performance, JavaScript delivery and execution can play a big role in how soon a page becomes accessible, and Marcy Sutton has done great work to shed light on that relationship (see How React.js Impacts Accessibility and Accessibility and Performance, and a fascinating discussion linked from that last post about accessibility tree timing). Still, as well as this intersection may be understood, these timing metrics are not commonly exposed in our web performance tools. As Léonie Watson tweeted, “it’d be good if performance tools could measure the time to the accessibility tree being created and/or the time to first accessibility API query”.
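To make the gap concrete: the Performance Timeline API that tools build on today only exposes visually-oriented entry types, and nothing related to accessibility-tree readiness. Here’s a minimal browser sketch that observes the existing paint and input metrics, and probes for a hypothetical accessibility entry type (the `accessibility-tree-ready` name is made up for illustration; no such entry type exists in any engine):

```javascript
// Returns true and starts observing if this environment exposes the entry
// type; returns false otherwise (so unsupported types fail gracefully).
function observeIfSupported(type, callback) {
  if (!('PerformanceObserver' in globalThis) ||
      !PerformanceObserver.supportedEntryTypes.includes(type)) {
    return false;
  }
  new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) callback(entry);
  }).observe({ type, buffered: true });
  return true;
}

// Existing "perceived" metrics are all paint- or input-based:
observeIfSupported('largest-contentful-paint', (e) => console.log('LCP:', e.startTime));
observeIfSupported('first-input', (e) => console.log('FID:', e.processingStart - e.startTime));

// The hypothetical metric this post argues for, unsupported everywhere today:
const hasA11yMetric = observeIfSupported('accessibility-tree-ready', () => {});
console.log('accessibility timing exposed?', hasA11yMetric);
```

Run in a browser, the first two observers report real timings while the last probe returns false, which is exactly the asymmetry at issue.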

In our ongoing push for practices that produce inclusive and accessible experiences by default, we need our performance metrics to be inclusive as well.

Some thoughts on what that could look like:

  • It would be useful to have insight into the moment when assistive technology is able to interact with and communicate page content, so that we can know when a page is “ready” for all users, and not just some. If possible, this measurement could factor into existing metrics that already represent page “readiness,” rather than merely adding a new “accessible-ready” metric. In other words, if the page isn’t ready for everyone yet, it isn’t ready yet.
  • It’d be interesting to know which existing metrics are irrelevant to assistive tech, for example, if a particular metric occurs before the accessibility tree is exposed.
  • I also wonder if it might be interesting to measure “jank” and stability in the process of arriving at a usable accessibility tree. For example, in a server-side-rendered React/Vue/etc. scenario, is the accessibility tree initially created one way and later “hydrated” into a much different state? If so, is this akin to, say, the layout shifts we already track with stability metrics, which disrupt the visual user experience?
  • Lastly, how are metrics like First Input Delay translating to the interaction time that someone experiences when using assistive technology? Is an application unable to respond to interaction quickly enough to meaningfully communicate what’s happening to the user?
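For comparison, the visual “jank” mentioned above is already measurable today via the Layout Instability API’s `layout-shift` entries; nothing analogous exists for churn in the accessibility tree. A hedged sketch of how the visual side is tracked (Chromium-only at present):

```javascript
// Accumulates a cumulative layout shift score from layout-shift entries,
// ignoring shifts that follow recent user input (as CLS does).
let cumulativeLayoutShift = 0;

function trackLayoutShift() {
  if (!('PerformanceObserver' in globalThis) ||
      !PerformanceObserver.supportedEntryTypes.includes('layout-shift')) {
    return false; // unsupported outside Chromium-based browsers
  }
  new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      if (!entry.hadRecentInput) cumulativeLayoutShift += entry.value;
    }
  }).observe({ type: 'layout-shift', buffered: true });
  return true;
}

trackLayoutShift();
```

An accessibility-tree equivalent would presumably need the browser to emit comparable entries whenever the exposed tree is rebuilt or substantially mutated.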

We should want to know if our patterns are optimizing for visual use cases at the expense of others, particularly when many of these patterns excel at making increasingly slow applications merely seem like they’re loading fast, perhaps enabling worse real-world performance for many.

In an effort to further this conversation, I filed a feature request with the Lighthouse team, and another with WebPageTest, for review. Several folks with deep knowledge of this area have already chimed in with information and support. If you’re able to add insight, or even just affirmation that this sounds like something we need, please do!

Here are those issues:
