If your digital service relies on holiday traffic, now is the time to check your Service Level Objectives (SLOs). If you don’t have them, or if you’re only using availability-based SLOs, then time is running out to set up the experience SLOs you need.
What's an SLO?
SLOs are the targets you set for your service so that it meets its Service Level Agreements (SLAs). Even if they’re internal and not guaranteed to customers, SLOs are a practical way to measure the quality of service you’re providing.
Why availability SLOs aren't enough this holiday season
Keeping an eye on your SLOs is essential for business, especially during the holidays. The National Retail Federation projected that 2023 holiday spending would reach record levels of over $950 billion, representing 19% of annual retail sales. That means downtime in this period could result in significant losses for your business. And if you’re only using availability SLOs, you should seriously consider performance-based SLOs, which represent customer experience better than uptime does. A bad user experience can be as harmful as downtime, or worse, to your business and brand reputation.
Increased holiday traffic may cause congestion and performance issues that don’t impact your service availability but can significantly harm your customer experience. Setting SLOs on performance metrics can give you an early warning of negative trends that impact your business long before you reach an alert threshold.
Availability SLOs vs Experience SLOs for monitoring your users’ experience
Below are two SLO burndown charts for the website of a major international manufacturer with an annual revenue of over $50 billion. A burndown chart shows an SLO's progress over time and whether it is on track to meet or fail its objective. The metrics (or Service Level Indicators) for these SLOs are measured by Catchpoint Internet synthetic testing.
The first chart is for an SLO of 99.5% availability per month. So far, in November, the site has had no downtime, so everything looks great for holiday business. The story is quite different, however, when you look at the experience SLO.
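To make the 99.5% figure concrete, here is a minimal sketch of the downtime budget that target implies. It assumes a 30-day month and that availability is counted in minutes; the function name is illustrative and not part of any Catchpoint API.

```python
def downtime_budget_minutes(target: float, days: int) -> float:
    """Allowed downtime, in minutes, for an availability target over a window."""
    total_minutes = days * 24 * 60
    return (1.0 - target) * total_minutes


# Assumption: a 30-day month, such as November.
budget = downtime_budget_minutes(0.995, 30)
print(f"Allowed downtime: {budget:.0f} minutes")
```

Under these assumptions the budget comes to roughly 216 minutes for the month, so a site with no downtime still has its full availability budget.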
The second chart is the burndown for an experience SLO, defined for this service as follows: 98% of web tests must complete within 7 seconds, the point beyond which users tend to get frustrated and bounce. To avoid false positives, the violation condition requires 2 tests in different locations to miss the objective within 15 minutes.
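The violation condition can be sketched in a few lines. This is only an illustration of the rule as described here, not Catchpoint's implementation: the `WebTestRun` type, the `find_violations` function, and the sample locations and timings are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class WebTestRun:
    timestamp: datetime
    location: str
    duration_s: float


def find_violations(runs, threshold_s=7.0, window=timedelta(minutes=15)):
    """Pair each slow run with the next slow run from a different
    location that happens within `window` of it."""
    slow = sorted(
        (r for r in runs if r.duration_s > threshold_s),
        key=lambda r: r.timestamp,
    )
    violations = []
    for i, first in enumerate(slow):
        for second in slow[i + 1:]:
            if second.timestamp - first.timestamp > window:
                break  # sorted by time, so later runs are further away
            if second.location != first.location:
                violations.append((first, second))
                break
    return violations


# Hypothetical sample data, not taken from the charts.
start = datetime(2023, 11, 8, 12, 0)
runs = [
    WebTestRun(start, "New York", 8.2),
    WebTestRun(start + timedelta(minutes=10), "London", 7.5),
    WebTestRun(start + timedelta(minutes=20), "Paris", 3.1),
    WebTestRun(start + timedelta(minutes=60), "Tokyo", 12.0),
]
violations = find_violations(runs)
```

Requiring a second slow test from a different location means that a single slow node, or an isolated slow run like the Tokyo one above, does not count against the SLO on its own.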
You can see from the chart and list of violations that the website is performing well until November 8. After that, it begins to experience periods where test times exceed the SLO threshold. Since the tests are run from Catchpoint’s Internet backbone nodes, they’re directly in line with actual users’ experience, which we can see is starting to degrade. By November 17, the SLO budget is gone and the SLO can no longer be met for the month, well before Black Friday and Cyber Monday shopping have even begun.
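A burndown of this kind can be approximated as the share of the error budget still unspent after each day. The sketch below assumes the budget is counted in individual failed tests, which may differ from how the charts here are computed, and all the numbers are hypothetical.

```python
def budget_burndown(daily_failed, target, expected_total):
    """Fraction of the error budget remaining after each day.

    daily_failed: failed-test count per day (hypothetical input)
    target: SLO target, e.g. 0.98
    expected_total: tests expected over the whole SLO period
    """
    allowed_failures = (1.0 - target) * expected_total
    remaining = []
    failed_so_far = 0
    for failed in daily_failed:
        failed_so_far += failed
        remaining.append(1.0 - failed_so_far / allowed_failures)
    return remaining


# Assumption: 3,000 tests in the month, so a 98% target allows about 60 failures.
remaining = budget_burndown([0, 0, 10, 20, 30], target=0.98, expected_total=3000)
for day, share in enumerate(remaining, start=1):
    print(f"day {day}: {share:.0%} of budget left")
```

Once the remaining share reaches zero, no later stretch of good performance can bring the SLO back within its objective for that period.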
Further investigation shows that the average web test time is only 3.8 seconds, well within the SLO definition and acceptable norms. However, we know that average metrics can be very deceiving and don’t represent what individual users experience. A more detailed look shows multiple spikes in test time, more than enough to meet the violation criteria.
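A small made-up data set shows how an average can hide this. The durations below are hypothetical, chosen only so that the mean lands near 3.8 seconds; they are not the measurements behind the charts.

```python
from statistics import mean

# Hypothetical durations in seconds: most tests are fast, a few spike.
durations = [3.5] * 95 + [10.0] * 5

THRESHOLD_S = 7.0   # tests must complete within 7 seconds
TARGET = 0.98       # 98% of tests must do so

avg = mean(durations)
within = sum(d <= THRESHOLD_S for d in durations) / len(durations)
meets_slo = within >= TARGET

print(f"average: {avg:.2f} s")
print(f"within {THRESHOLD_S:.0f} s: {within:.0%}")
print(f"meets SLO: {meets_slo}")
```

In this sample the average is comfortably under 7 seconds, yet only 95% of tests finish in time, so the 98% objective is missed.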
The above chart takes a closer look at the last 3 days. This shows performance spikes affecting several metrics commonly used to gauge user experience.
For example, where the average Time to Interactive climbs above 16 seconds, it’s unlikely anyone would wait for the webpage to become usable. Those users are more likely to bounce, resulting in lost revenue at a time when consumer spending is nearing its peak.
So, are your holiday experience SLOs in place?
It's obvious that monitoring your digital services is important over the holidays. What’s less obvious, but clearly shown above, is that availability SLOs are not always good indicators of the health of a service, and don’t represent user experience very well. For that, you should be implementing experience SLOs on performance metrics.
To ensure peak performance during peak periods, sign up for our Internet Resilience Program.