The Hidden Cost of Web Components

This post was originally published on the Rigor Web Performance blog. It is based on a talk I gave at the Atlanta Web Performance Meetup. Here are the slides from that talk.

Modern websites make a lot of requests. And I mean a lot. And many of these requests are to third-party resources. As this trend continues, it is important to routinely analyze the performance cost of your site’s resources to identify areas for optimization.

One approach to such an analysis is to aggregate requests at the domain level. Using raw HAR (HTTP Archive) data, the data that underlies the popular waterfall chart, we can calculate the performance cost of each domain that our site uses.
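As a rough illustration of the domain-level aggregation described above, here is a minimal Python sketch. It assumes the HAR file has already been parsed into a dictionary (for example with json.load) and uses the standard HAR fields log.entries, request.url, and time. The function names domain_costs and slowest_domains are made up for this example, and summing per-request time is only one possible definition of "cost": requests run in parallel, so the totals will not add up to the page's wall-clock load time.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def domain_costs(har):
    """Return {domain: {"requests": n, "time_ms": total}} for a parsed HAR dict."""
    costs = defaultdict(lambda: {"requests": 0, "time_ms": 0.0})
    for entry in har["log"]["entries"]:
        domain = urlsplit(entry["request"]["url"]).hostname or "(unknown)"
        costs[domain]["requests"] += 1
        # "time" is the total elapsed time of the request in milliseconds.
        costs[domain]["time_ms"] += entry.get("time", 0)
    return dict(costs)


def slowest_domains(har, limit=5):
    """Return the `limit` domains with the largest summed request time."""
    ranked = sorted(
        domain_costs(har).items(),
        key=lambda item: item[1]["time_ms"],
        reverse=True,
    )
    return ranked[:limit]
```

Calling slowest_domains(har) on a parsed HAR would give a list of (domain, totals) pairs, slowest first, which is roughly the shape of a five-slowest-domains table.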

Using this HAR as an example, our domain analysis for the five slowest domains would look like this:

http://rigor.com/wp-content/uploads/2014/11/screenshot-2014-11-26-at-1.38.31-PM-e1417027172979.png

CNN Domain Analysis

This approach makes it obvious which domains contribute the most to our overall load time. But now what? One option is to eliminate requests to a given domain to reduce its cost. For example, let’s remove all requests to z.cdn.turner.com. A quick scan of the page source reveals nine references to this domain:

https://docs.google.com/a/rigor.com/uc?id=0B4OqDVTQ1tMPQ1d3WmFYZHRNT3c

CNN Page Source

Removing all nine of these should do the trick, right? Unfortunately, no. Looking back at our domain analysis, there are actually 20 requests being made to this domain. So where are the other 11 requests coming from?

Tracking down requests with HTTP Referer

To find the 11 other requests, we can use the HTTP Referer request header to reevaluate our HAR data. This header identifies the resource responsible for making a given request. Here is what the referer analysis looks like for our example HAR:

https://docs.google.com/a/rigor.com/uc?id=0B4OqDVTQ1tMPTU5GVUFMcFE2LTg

CNN Referer Analysis
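A referer report of this kind can be approximated by grouping HAR entries on their Referer request header. The sketch below is an assumption about how such a grouping might be done, not the code behind the report shown here. It relies on the standard HAR layout, where request.headers is a list of name/value pairs; referer_of and requests_by_referer are hypothetical names.

```python
from collections import defaultdict


def referer_of(entry):
    """Return the Referer header value of a HAR entry, or None if absent."""
    for header in entry["request"].get("headers", []):
        # Header names are case-insensitive, so normalize before comparing.
        if header["name"].lower() == "referer":
            return header["value"]
    return None


def requests_by_referer(har):
    """Group requested URLs by the resource named in their Referer header."""
    groups = defaultdict(list)
    for entry in har["log"]["entries"]:
        key = referer_of(entry) or "(no referer)"
        groups[key].append(entry["request"]["url"])
    return dict(groups)
```

Sorting the result by the length of each list would surface the referers responsible for the most requests.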

Instead of aggregating requests by domain, we can now see the resources responsible for the majority of the site’s requests. Not surprisingly, the base page (cnn.com, in this case) is often the main referer. But scanning the table reveals other expensive components, one of which is a resource loaded from z.cdn.turner.com. Expanding this referer reveals several requests to z.cdn.turner.com that we weren’t able to find in the page source:

https://docs.google.com/a/rigor.com/uc?id=0B4OqDVTQ1tMPN2R2TjZNSWcwQnc

Second referer

To make this new analysis even more powerful, we can search for all resources referring to or from z.cdn.turner.com. Any resource matching the search is either requesting additional resources from that domain or is hosted on that domain. Here is what our search results would look like using our same example HAR:

https://docs.google.com/a/rigor.com/uc?id=0B4OqDVTQ1tMPVHVBN0JjX2cxWms

Referer search
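The search described above could be sketched as a filter that keeps any HAR entry whose request URL or Referer header points at the domain of interest. This is a minimal sketch under the same assumptions about the HAR layout (log.entries, request.url, request.headers); search_domain is a hypothetical helper name, and it matches on the exact hostname only.

```python
from urllib.parse import urlsplit


def _hostname(url):
    """Return the hostname of a URL, or None if the URL is missing."""
    return urlsplit(url).hostname if url else None


def search_domain(har, domain):
    """Return (url, referer) pairs that are hosted on, or referred from, `domain`."""
    matches = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        referer = next(
            (h["value"] for h in request.get("headers", [])
             if h["name"].lower() == "referer"),
            None,
        )
        if _hostname(request["url"]) == domain or _hostname(referer) == domain:
            matches.append((request["url"], referer))
    return matches
```

For the example in this post, the call would be search_domain(har, "z.cdn.turner.com").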

Using the power of HTTP Referer, we can now assign costs to each component we add to our site by seeing how many requests it makes. Instead of treating a new JavaScript library as a single resource, for example, we can now include all the dependent resources it requests in our cost analysis, giving us more insight into the cost of a given file.
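One way to put a number on a component's dependent resources is to follow Referer chains outward from it and count every request reached. The sketch below is an illustration only. It assumes each Referer value is a full URL that exactly matches the URL of the referring request, which will not hold if a referrer policy trims the header to the origin. dependent_request_count is a hypothetical name.

```python
from collections import defaultdict


def dependent_request_count(har, root_url):
    """Count requests that trace back to root_url through a chain of Referer headers."""
    # Map each referer URL to the URLs requested on its behalf.
    children = defaultdict(list)
    for entry in har["log"]["entries"]:
        request = entry["request"]
        for header in request.get("headers", []):
            if header["name"].lower() == "referer":
                children[header["value"]].append(request["url"])
                break

    seen = {root_url}  # guards against referer cycles
    stack = [root_url]
    count = 0
    while stack:
        for url in children.get(stack.pop(), []):
            count += 1
            if url not in seen:
                seen.add(url)
                stack.append(url)
    return count
```

Adding one (for the component itself) to the result gives the total number of requests that adopting that component brings with it.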

To make this type of analysis easier, we’ve created a tool at insights.rigor.com. Upload a HAR file, and the tool will generate domain and referer reports to help you identify costly components. Next time you are adding resources to your website, consider using HTTP Referer to combat bloat and slow load times.