How to Improve the Resource Timing API

The W3C developed the Resource Timing API specification to help fill a gap in Real User Monitoring. Learn about this API's major shortcomings.

Getting visibility into webpage performance as experienced by the end user is key to any company. Over the last couple of years, the number of Real User Monitoring methods and tools available to DevOps teams has skyrocketed to meet this demand.

However, the key method we all rely on to collect the data, the Navigation Timing API, offers no visibility into third-party performance, or really into anything other than the root request timing and page-level metrics. Luckily, the W3C developed the Resource Timing API specification to help fill that gap, and the spec was adopted by Chrome, Firefox, and IE. Sadly, as Steve Souders pointed out in November, the API has a major shortcoming that renders it almost useless.

Resource Timing Shortcomings

The Resource Timing API allows developers to get the following data on each URL loaded from the page:

  • connectEnd*
  • connectStart*
  • domainLookupEnd*
  • domainLookupStart*
  • duration
  • entryType
  • fetchStart
  • initiatorType
  • name
  • redirectEnd*
  • redirectStart*
  • requestStart*
  • responseEnd
  • responseStart*
  • secureConnectionStart*
  • startTime

For any content that doesn’t share the same domain as the document (i.e. any third parties and sharded domains), additional steps are required to obtain the metrics above that are marked with an asterisk. To expose them, each HTTP response for that content must include the Timing-Allow-Origin header. Without the header, the closest metric for understanding the impact of a particular request is duration.
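As a rough illustration of the restriction described above, the sketch below flags entries whose detailed timings appear to have been withheld. It assumes that browsers report the withheld attributes as 0, and it uses requestStart and responseStart as the signal because some other starred fields can be 0 for unrelated reasons (for example, redirectStart when there was no redirect). The `TimingEntry` interface and both helper names are hypothetical and only list the fields used here.

```typescript
// Minimal shape of the fields this sketch reads. Browser
// PerformanceResourceTiming entries expose properties with these names.
interface TimingEntry {
  name: string;
  startTime: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
  duration: number;
}

// Hypothetical helper: heuristic check for whether the detailed
// (asterisked) timings are available for an entry.
// Assumption: withheld attributes are reported as 0.
function hasDetailedTiming(entry: TimingEntry): boolean {
  return entry.requestStart > 0 && entry.responseStart > 0;
}

// Hypothetical helper: the entries for which only duration-level
// data (startTime, fetchStart, responseEnd, duration) can be trusted.
function restrictedEntries(entries: TimingEntry[]): TimingEntry[] {
  return entries.filter((entry) => !hasDetailedTiming(entry));
}
```

In a browser, the input would typically come from `performance.getEntriesByType("resource")`. On the server side, the least restrictive form of the header is `Timing-Allow-Origin: *`.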

Sadly, few third parties have implemented this header; Facebook is the most prominent to have done so. Google, probably the largest third-party provider on the web when you factor in its Ads and Analytics platforms, supports the API in Chrome but has yet to adopt the header for its many products.

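Unfortunately, duration is flawed because it includes the time the request was blocked by another request or by the rendering logic of the browser. In other words, one portion of it is due to the page rendering (not the servers), and another portion covers the communication over the network. As a hypothetical example, a script that is held for 400 ms behind other requests and then spends 100 ms on the network reports a duration of roughly 500 ms, even though the provider is responsible for only a fifth of it.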

Relying on this data to report on third party performance is a huge red herring and can place the blame on providers who actually had little or no impact on perceived and total load times of the page. For this reason, we at Catchpoint have delayed inclusion of any features utilizing the Resource Timing API in our real user monitoring product.

The Duration Trap

Interestingly, the issue of “Duration” including browser time also affects browser network tools like Chrome Developer Tools, Firefox Developer Tools, and Firebug. In these tools, the “Duration” metric for each request listed includes Blocked (or Stalled) time, which is the time between the browser detecting the request and starting the HTTP request.

Clearly the developers of the tools followed what seems logical from the browser perspective; the Duration is the time from detecting the request to finishing it. However, the end users of these tools often read it from their perspective, i.e. how long it took for the URL to finish loading.

Fixing the API

To solve the shortcomings of the API, Steve Souders proposed a new metric called “networkDuration” (domainLookupStart to responseEnd). While the addition of “networkDuration” to the API would be a great gift for all of us RUM fanatics (Steve has already kicked off this request), we propose that the need for Timing-Allow-Origin be removed from the specification entirely.

The “networkDuration” is the sum of the various components (DNS, Connect, TTFB, etc.) and offers no more privacy protection than exposing each of the values individually. The privacy concerns over knowing how a URL performed are minimal, as revealing this data would hardly impact any company (you can already measure it today through the browser network tools or synthetically). Neither does this data affect the privacy of the end user, as none of the values derive from user behavior or personal information; nothing in them is private between the end user and the target server.
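To make the "sum of the components" point concrete, here is a minimal sketch of how such a value could be derived today for entries whose detailed timings are exposed. Note that networkDuration is a proposed metric, not an attribute of the current API; the component breakdown below (DNS, connect, time to first byte, download) is one reading of that proposal, and the `DetailedTiming` and `NetworkBreakdown` types and the `networkBreakdown` function are hypothetical names.

```typescript
// Fields read by this sketch; they match attribute names on
// PerformanceResourceTiming entries.
interface DetailedTiming {
  domainLookupStart: number;
  domainLookupEnd: number;
  connectStart: number;
  connectEnd: number;
  requestStart: number;
  responseStart: number;
  responseEnd: number;
  duration: number;
}

interface NetworkBreakdown {
  dns: number;             // domainLookupStart -> domainLookupEnd
  connect: number;         // connectStart -> connectEnd
  ttfb: number;            // requestStart -> responseStart
  download: number;        // responseStart -> responseEnd
  networkDuration: number; // sum of the four components above
  nonNetwork: number;      // duration minus networkDuration
}

// Assumption: all detailed timings are available (non-zero) for the entry;
// for restricted cross-domain entries the result would be meaningless.
function networkBreakdown(e: DetailedTiming): NetworkBreakdown {
  const dns = e.domainLookupEnd - e.domainLookupStart;
  const connect = e.connectEnd - e.connectStart;
  const ttfb = e.responseStart - e.requestStart;
  const download = e.responseEnd - e.responseStart;
  const networkDuration = dns + connect + ttfb + download;
  return {
    dns,
    connect,
    ttfb,
    download,
    networkDuration,
    nonNetwork: e.duration - networkDuration,
  };
}
```

Because every input to this sum is one of the restricted attributes, exposing only the total would reveal little that the individual values do not, which is the argument made above.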

If anything, the owner of the webpage should demand that the performance data of their partners (the third parties on the page) be available to them, since it affects their own performance. Because the end user is unaware of the third parties, any performance issue they perceive on the page is attributed to the webpage (or their internet connection) rather than to the real culprit(s).

And as long as we’re enhancing the spec, it would be great to add support for entries for URLs that fail. Requests that time out are currently not reported at all, leaving another hole in the visibility that the API provides.

Conclusion

Today’s Real User Monitoring methods and tools do not provide clear insight into the performance of URLs loaded by the page from other domains. Those who rely on the Resource Timing API might be collecting the wrong measurements and pointing fingers at the wrong companies.

It is time to change this API. Let’s remove its privacy limitations: they are not protecting anything valuable, and they simply prevent companies from clearly measuring what they must measure.
