I recently came across a very interesting post about CDN testing by Jonathan Klein, someone I respect tremendously in the Web Performance world. During his tests with WebPageTest he was unable to “see” the value provided by the CDN on his site.
I wanted to follow up on his post and share some of the knowledge we have acquired, both from our experience at DoubleClick (building our own CDN in 1997, then using commercial CDNs such as Speedera and Akamai, then a combination of internal and Akamai until Google’s acquisition and the switch to the Google stack) and now at Catchpoint, where we constantly monitor CDNs.
The basic promise of the traditional CDN product: Better / Faster Web Performance for your End-Users by serving static assets from the Edge! (Note: CDN companies offer additional products like Site Acceleration, which go beyond serving static files.)
In theory they can deliver on this promise because:
In other words they leverage economies of scale to provide you with a service that is faster, better, and more cost-effective than what it would be if you relied on your own infrastructure.
So how do you test and benchmark CDNs to make sure they deliver on their promise? This is not so much about the toolset (some will say that Backbone monitoring is not accurate); it’s about the methodology. A benchmark is a benchmark: if there are issues, adjust the lens and zoom in, and you will see the differences and problems.
The testing period is very important: test for at least 1 to 2 weeks, and test as frequently as possible! (I was recently involved in a multi-CDN benchmark where a company used more than 1 million data points from Catchpoint alone; other tools were used as well.)
Before you start your tests, please make sure the DNS TTL is the same for every hostname being benchmarked. For example, you cannot have a TTL of 60 seconds for cdn.test.com, a TTL of 3,600 seconds for origin.test.com and a TTL of 300 seconds for cdn2.test.com.
Please make sure you are comparing files of the same size.
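As a rough illustration of the TTL check, here is a minimal Python sketch. It assumes you have already collected the TTL of each hostname yourself (for example from `dig` output or a DNS library); the lookup is deliberately left out. The helper names `group_by_ttl` and `ttls_are_comparable` are made up for this example, and the values are the ones from the paragraph above.

```python
from collections import defaultdict
from typing import Dict, List, Mapping


def group_by_ttl(ttls: Mapping[str, int]) -> Dict[int, List[str]]:
    """Group hostnames by their DNS TTL (in seconds)."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for host, ttl in ttls.items():
        groups[ttl].append(host)
    return dict(groups)


def ttls_are_comparable(ttls: Mapping[str, int]) -> bool:
    """True only when every benchmarked hostname has the same TTL."""
    return len(set(ttls.values())) <= 1


# TTLs taken from the example in the text (collected by hand, not looked up here).
example_ttls = {
    "cdn.test.com": 60,
    "origin.test.com": 3600,
    "cdn2.test.com": 300,
}

if not ttls_are_comparable(example_ttls):
    for ttl, hosts in sorted(group_by_ttl(example_ttls).items()):
        print(f"TTL {ttl}s: {', '.join(hosts)}")
```

If more than one group is printed, the DNS portion of the benchmark is not comparable until the TTLs are aligned.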
The testing methodology I have used and have seen others use relies on a 2-phase approach:
Use an external performance tool (Synthetic Backbone or Last Mile, RUM, WPT…) to load pairs of identical files of various sizes (5 KB, 10 KB, 50 KB, 100 KB, 500 KB…), one copy from the CDN URL and one from the Origin URL.
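To make the measurement concrete, here is a minimal Python sketch of a single timed fetch, broken down into DNS, Connect, Wait and total Response time plus throughput. It is only an illustration, not how any commercial monitoring tool works. The function names `time_fetch` and `throughput_kbps` are made up for this example. It relies on the operating system resolver (so local DNS caching affects the DNS number), it counts the TLS handshake as part of Connect time for https URLs, and the byte count includes the response headers.

```python
import socket
import ssl
import time
from urllib.parse import urlsplit


def throughput_kbps(size_bytes: int, seconds: float) -> float:
    """Throughput in kilobits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_bytes * 8 / 1000 / seconds


def time_fetch(url: str, timeout: float = 10.0) -> dict:
    """Fetch one URL over a fresh connection and return timings in milliseconds."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    t0 = time.perf_counter()
    # DNS: resolve through the OS resolver and use the first address returned.
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    t_dns = time.perf_counter()

    # Connect: TCP handshake, plus the TLS handshake for https URLs.
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    sock.connect(addr)
    if parts.scheme == "https":
        context = ssl.create_default_context()
        sock = context.wrap_socket(sock, server_hostname=host)
    t_connect = time.perf_counter()

    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Accept-Encoding: identity\r\n"
        "Connection: close\r\n\r\n"
    )
    sock.sendall(request.encode("ascii"))

    # Wait: time from sending the request until the first bytes arrive.
    chunk = sock.recv(65536)
    t_first = time.perf_counter()

    # Read until the server closes the connection.
    size = len(chunk)
    while chunk:
        chunk = sock.recv(65536)
        size += len(chunk)
    t_done = time.perf_counter()
    sock.close()

    return {
        "dns_ms": (t_dns - t0) * 1000,
        "connect_ms": (t_connect - t_dns) * 1000,
        "wait_ms": (t_first - t_connect) * 1000,
        "response_ms": (t_done - t0) * 1000,
        "bytes": size,
        "throughput_kbps": throughput_kbps(size, t_done - t_connect),
    }
```

Calling `time_fetch` on the CDN URL and on the Origin URL for each file size, at a regular interval and from several locations, gives rows that can be compared metric by metric. A single run from a single machine says very little.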
Phase 1 Key Takeaways:
Create a custom HTML page based on your existing webpages. Include CSS files, JS files and images that match your setup, but NO third parties (ads, widgets, tracking, etc.): your CDN will make your site faster but cannot make the ads load any faster. As in the previous phase, Page A will hit the CDN and Page B will hit your Origin. Monitor and measure both pages using the same tools as in Phase 1.
Phase 2 Key Takeaways:
– DNS time. Some CDNs have a more complex DNS setup than others, which can slow things down. I have seen cases where the time gained in Wait time was diluted by a slower DNS response time.
Keep in mind that DNS performance from the last mile (the end user) is quite different from tests run on the backbone. End users rely on the DNS resolvers of their ISPs or on Public Resolvers. Backbone monitoring relies on resolvers that are very close to the machine running the tests.
DNS Lookup time of Various CDNs (US only)
– Connect time: This is to make sure your CDN has great network connectivity, low latency and no packet loss. Additionally, you want to make sure it does not get slower during peak hours and that they are routing you to the right network peering. For example, if an end-user is on Verizon FIOS, there is no reason to go through 5 different backbone networks because that CDN does not have direct peering with Verizon.
Connect Time of Various CDNs (US Only)
– Wait time: This metric is important when looking at various CDNs. It helps you see whether your content is hot on the edge or whether the edge needs to fetch it from the Origin servers. The Wait time is also an indicator of potential capacity issues or bad configuration at the CDN level or on the Origin server (for example, setting cache expiration in the past). A CDN will deliver different performance if an asset is hot (requested 100,000+ times in the past hour) vs. requested a few times an hour. A CDN is a shared environment where more popular items are faster to deliver than others: if something is in memory it’s fast; if it has to hit a spinning disk it’s a different story. Thus I would personally consider Solid State Drives as a criterion in my CDN selection.
Wait Time of Various CDNs (US Only)
– Throughput: Make sure that the throughput of the CDN test is higher than the origin no matter what the file size is!
Response time & Throughput of CDN vs. Origin (US + Canada)
Response time Vs. Throughput of various CDNs (US only)
– Traceroutes! You need to run traceroutes from where you are monitoring to make sure you are not mapped to the wrong place. Many CDNs use commercial geo-mapping databases, and the data for an IP could be wrong. From my Time Warner home connection in Los Angeles, some CDNs sent requests to the UK (at times).
– Most CDNs will give you access to a control panel, so make sure you monitor your Cache Hit / Miss ratio. How often do they have to come back to the Origin? A good CDN architecture should not come back to the Origin often. We disqualified various CDNs at DoubleClick because they would not agree to our miss ratio SLAs. You also have to ask: What happens when an edge server does not have the content? How long does it take to purge a file? How long does it take to load a file to the edge? How long before a CNAME is active?
– How well does the CDN handle public Name Resolvers such as OpenDNS, Dyn, Google? These companies are carrying more and more of the DNS traffic, and this could impact certain CDNs’ geo-load balancing algorithms.
– Are the metrics from the CDN consistent? Look at DNS, Connect, Wait and Response (please do not just look at averages). Remember, great performance is more than speed: it’s reliability, the ability to deliver a consistent experience.
Major CDN vendor’s Response Time by Hour of Day (US Only)
Major CDN’s Response time – Different Statistical Models (US Only)
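To illustrate why averages alone can mislead, here is a small Python sketch that summarizes response-time samples with the mean, median, 95th percentile and standard deviation. The sample values are invented for the example and do not come from any real CDN; `summarize` is a hypothetical helper name.

```python
import statistics
from typing import Dict, Sequence


def summarize(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize response times (in ms) with more than just the average."""
    ordered = sorted(samples_ms)
    # quantiles(n=100) returns the 99 percentile cut points; index 94 is p95.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": cuts[94],
        "stdev": statistics.pstdev(ordered),
    }


# Invented samples: both series have the same average response time.
cdn_a = [100.0] * 10           # steady
cdn_b = [50.0] * 9 + [550.0]   # usually fast, occasionally very slow

for name, samples in (("CDN A", cdn_a), ("CDN B", cdn_b)):
    stats = summarize(samples)
    print(name, {key: round(value, 1) for key, value in stats.items()})
```

Both series share the same mean, yet the second has a far higher 95th percentile and standard deviation, which is exactly the kind of inconsistency an average hides.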
So after doing all these tests, the simple questions that must be answered are:
Now, besides speed, CDNs do bring other benefits that are not measured in seconds:
And once you have selected a CDN and are up and running on its platform, keep an eye on it: always monitor a file from the CDN and from the Origin. Another observation I can share is that a CDN is not a fire & forget technology. You have to stay on top of it and make sure the configuration is up to date, that GZIP is always on… I have been on many interesting calls where I unfortunately heard a Sales Engineer from a CDN company say “oops, we forgot to turn that on after our last release” or “we need to tweak your map”.
I welcome comments, suggestions and tips to help create a common knowledge base about CDN benchmarking. I am also looking forward to the RUM data that Jonathan is going to publish!
Mehdi – Catchpoint