Robots.txt can help in performance monitoring

When looking from the outside in and lacking insight into the internal application performance, monitoring Robots.txt can give quick answers.

One of the biggest challenges for online companies is troubleshooting unexpected slowness of a website or application in a highly distributed architecture. Pinpointing the root cause of the slowness is like looking for a needle in a haystack.

A typical large-scale website infrastructure has many layers where a problem can occur: the ISP, network devices (routers and switches), load balancers, hardware, web servers, application servers, databases, backend applications and services, and lastly the tens of firewall layers protecting the system.

Add the complexity of the page itself, third-party providers, and JavaScript code, and it becomes extremely hard to determine what is impacting performance when you are looking from the outside in.

Recently, we had the opportunity to help a company troubleshoot such a performance issue. They started monitoring their homepage with Catchpoint using IE8 and Chrome agents and observed abnormal slowness and variability.

Web performance chart

From the chart we can clearly see that the time to load the base HTML is quite high: it increases over time and then quickly drops. Clearly the problem was not with the content of the page or any third-party providers. The browser was spending about 40% of the time trying to download the HTML from the server.

With a couple of clicks we narrowed the problem down to the Wait time: the time it takes the client (browser) to get the first byte after the TCP connection is established. The metric shows how long the web server, the app server, and the database/backend take to process the request. However, when you measure this metric over a wide area network (say, the Internet) it is also affected by network connectivity, so it is not always clear whether the network or the application stack is at fault.
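To make the Wait time metric concrete, here is a minimal Python sketch of how a script could approximate it: open a TCP connection, send a GET request, and time how long the first response byte takes to arrive. This is not how Catchpoint measures it. The function name `measure_wait_time` and its parameters are hypothetical, and the sketch assumes plain HTTP without TLS.

```python
import socket
import time


def measure_wait_time(host, path="/", port=80, timeout=10.0):
    """Return (connect_seconds, wait_seconds) for a plain-HTTP GET.

    wait_seconds is the time from the TCP connection being established
    until the first response byte arrives. Assumption: no TLS is involved.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.perf_counter()
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        first_byte = sock.recv(1)  # blocks until the server starts responding
        got_first_byte = time.perf_counter()
    if not first_byte:
        raise ConnectionError("server closed the connection without responding")
    return connected - start, got_first_byte - connected
```

Calling `measure_wait_time("www.example.com", "/")` (a placeholder host) returns the connect time and the wait time in seconds. As noted above, a number measured this way over the Internet still mixes network delay with server processing time.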

Wait Time for Homepage

One quick way to take application performance out of the picture is to measure the response to an HTTP request that is served by the same server, over the same network, but is not handled by the web application. This is where my friend “Robots.txt” comes into play: most websites have this tiny file on their web servers, and it is usually served without going through the application layer.

We monitored the “robots.txt” performance and compared its wait time to that of the homepage test. If the issue were the network, the load balancer, or the server itself, we would see the two data sets correlate. If it were the application layer, we would see no correlation.
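The comparison above was done by eye on a chart, but the same reasoning can be sketched numerically. The Python below computes a Pearson correlation between two wait-time series taken at the same moments and applies the rule from the paragraph above. The function names, the 0.7 threshold, and the sample numbers are all assumptions for illustration, not data from this investigation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a perfectly flat series does not move with the other one
    return cov / sqrt(var_x * var_y)


def likely_culprit(homepage_ms, robots_ms, threshold=0.7):
    """Apply the rule of thumb: correlated wait times point below the app layer.

    The threshold is an arbitrary assumption; tune it against your own data.
    """
    if pearson(homepage_ms, robots_ms) >= threshold:
        return "network, load balancer, or server"
    return "application layer"


# Made-up sample wait times in milliseconds, taken at the same moments.
homepage_wait = [800, 1200, 1900, 2600, 900, 1300]
robots_wait = [42, 40, 45, 41, 43, 40]
print(likely_culprit(homepage_wait, robots_wait))
```

With these made-up numbers the homepage wait time swings widely while the robots.txt wait time stays flat, so the sketch points at the application layer. A correlation score is only a hint and does not replace looking at the charts.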

Wait time for Homepage VS. Robots.txt

As you can clearly see, the problem was caused not by the network or the load balancer but by the application itself. After further investigation of the application code, the client was able to determine the cause and solve the problem.

When looking from the outside in and lacking insight into the internal application performance, monitoring Robots.txt can be a quick way to figure out whether your application code is the one at fault.

Mehdi – Catchpoint
