Our recent webinar on ecommerce performance management raised some really interesting and important questions around monitoring ecommerce applications to ensure exceptional user experiences during high traffic events. We featured experts from Akamai and Walmart who weighed in on rethinking monitoring strategies and discussed:
- How ecommerce companies prepare for high traffic events.
- How proactive monitoring can improve performance.
- Tips for peak event preparation.
Complex dynamic applications drive online businesses, and application performance has a direct impact on business KPIs because it determines end-user experience. Performance monitoring tools play a crucial role in shaping that experience, and they have evolved alongside performance management strategies. Application availability, reachability, reliability and performance are the central pillars of digital end-user experience monitoring.
The evolution of performance management has pushed the need for a more proactive monitoring mindset, especially when preparing for high-traffic events.
Ecommerce Performance Monitoring
The digital landscape is evolving continuously – everything from the way applications are built, to the infrastructure they run on, to the skills involved in building, deploying and maintaining them. Applications are no longer monolithic; they have transitioned to microservices-based architectures, which provide far more flexibility in building and deploying applications. Teams can work exclusively on specific features without impacting the overall application.
IT infrastructure has also evolved to support this new breed of applications. It has moved from on-premises to multi-cloud, multi-CDN or even hybrid models. Edge computing is the next step in this evolution, moving the spotlight to edge content delivery, networking, security and edge performance monitoring.
Even with all these changes, delivering a great customer experience remains the focus and application performance is central to maintaining the customer experience. The definition of good end-user experience has also changed over the last few decades. There was a time when a page that loads in 10 seconds was acceptable. Ecommerce giants like Amazon and others have redefined customer experience and now the acceptable page load time is below 2 seconds.
Considering the current highly distributed and complex architecture, there is a pressing need to rethink performance monitoring to provide insightful data and analysis. Performance monitoring is crucial because it:
- Mitigates impact on revenue
  - Every 1 second of performance improvement increases conversions by 2%.
  - Every 100 ms of performance improvement grows incremental revenue by up to 1%.
  - Improves SEO for entry pages and reduces bounces.
- Protects your brand value
  - Downtime costs $8,000/minute – roughly $800,000 per incident.
  - Downtime can have a significant impact on brand value.
- Saves IT productivity
  - IT spends less time firefighting performance issues.
  - You can focus on building, deploying and marketing your products and services.
Preparing for High-Traffic Events
When prepping for a peak event such as Black Friday, there are four important phases to consider:
- Building effective strategies: Consider a multi-tenant application architecture that is resilient and delivers better performance. Invest in caching and failover strategies:
  - Offload caching to CDNs.
  - Improve cache hit ratios on CDN and tiered cache proxies.
  - Revisit cache-busting scripts.
  - Implement failovers:
    - DC/cloud-based delayed failovers.
    - Application-level failovers.
  - Improve customer experience via waiting room implementations.
- Preparation and testing: Stress test the application to understand how performance varies at different traffic levels and to identify bottlenecks. This includes testing all third-party services – the CDN provider as well as the monitoring tools you use.
- Implement performance monitoring: Build simplified single-pane dashboards that give a clear picture of the entire delivery chain. Keep log aggregation intervals consistent across the different layers of the infrastructure.
- Alert configuration: Identify and set up the relevant alert types and severities, and map each alert to the right team so that issues are addressed immediately.
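As a minimal sketch of the alert-configuration step, the routine below maps a measured page-load latency to a severity and an owning team. The thresholds and team names are illustrative assumptions, not values from the webinar.

```python
# Map a measured page-load latency (in ms) to an alert severity and owning team.
# Thresholds and team routing here are illustrative assumptions.

def classify_alert(latency_ms, error=False):
    """Return a (severity, team) pair for a measurement, or None if no alert is needed."""
    if error:
        return ("critical", "sre-oncall")   # availability failure: page immediately
    if latency_ms > 5000:
        return ("critical", "sre-oncall")   # severe degradation
    if latency_ms > 2000:
        return ("warning", "app-team")      # above the ~2 s acceptable load time
    return None                             # within budget: no alert

print(classify_alert(8000))  # ('critical', 'sre-oncall')
print(classify_alert(2500))  # ('warning', 'app-team')
print(classify_alert(900))   # None
```

In practice the same mapping would live in your monitoring tool's alert rules; the point is that severity and routing are decided up front, not during the incident.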
During the event, ensure you have a support team on call and ready to take action when needed. Discuss guidelines defining escalation policies as part of the prep; this makes it easier for different teams to communicate any critical issue without delay. There should also be an effective plan of action in place to handle a crisis.
Once the event is over, it is important to conduct a retrospective analysis of the performance data and incidents during the event. This helps to understand –
- Did everything go according to plan?
- What could have been done better?
- How did the infrastructure handle the traffic and load?
- How does the stress test data gathered during event prep compare to the post-event data?
The performance data can also be used to benchmark different metrics that will help you prepare better for the next peak event.
Proactive Monitoring for Improved Performance
There are multiple third-party infrastructure and service providers in the industry. The adoption of services such as multi-DNS, multi- and hybrid-cloud, multi-CDN, and others means that when everything is operating smoothly, end users get content and services delivered to them faster than ever before.
However, these developments in architecture and digital delivery come with a cost. Every additional layer in the delivery chain adds complexity, introduces visibility gaps, and reduces these teams’ ability to understand how infrastructure health is affecting the end-user experience. This means that whenever there is a disruption in the infrastructure delivery chain, IT teams are often left scrambling to identify the root cause of the problem.
Proactive monitoring essentially eliminates the blind spots created by the many components in the delivery chain. Root cause analysis becomes easier because IT teams can correlate and analyse data effectively, and the performance data helps identify and resolve bottlenecks. Third-party integrations can be monitored, and you can hold service providers accountable for any SLA breaches. Proactive monitoring is especially useful during A/B testing, as you can evaluate the performance of each component.
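Holding a provider accountable for an SLA breach reduces to comparing measured availability against the contracted target. The sketch below assumes a 99.9% availability SLA and made-up probe results, purely for illustration.

```python
# Compute measured availability from synthetic check results and flag SLA breaches.
# The 99.9% target and the sample probe data are illustrative assumptions.

def availability(checks):
    """Fraction of successful checks, e.g. 0.999 for 99.9%."""
    return sum(1 for ok in checks if ok) / len(checks)

def breaches_sla(checks, target=0.999):
    """True if measured availability falls below the contracted target."""
    return availability(checks) < target

# 10,000 synthetic probes with 15 failures -> 99.85% measured availability
checks = [True] * 9985 + [False] * 15
print(f"{availability(checks):.2%}")  # 99.85%
print(breaches_sla(checks))           # True: below the 99.9% target
```

The same calculation works per provider, which is what lets you attribute a breach to a specific link in the delivery chain rather than to "the site".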
Tips for Better Preparation
Proactive monitoring is a must when preparing for high traffic events. We suggest a five-step process for effective and improved performance monitoring:
- Measure everything: Latency or downtime can be introduced at any layer of the application. Critical endpoints, microservices and tag management tools are all potential bottlenecks. Monitor every single component, including every third party, so there is end-to-end performance visibility. Measuring real user performance alongside synthetic monitoring helps correlate performance trends and understand user behavior.
- Benchmark: Benchmarking is essential to performance monitoring. It helps you understand industry best practices, evaluate multiple service providers, and identify those with the best performance. The trends from benchmarking provide interesting insights that help improve your application's performance.
- Establish a baseline: A performance baseline is the expected performance of an application/service under certain conditions. With this information, we can determine:
  - What the expected performance will be when there is a surge in traffic.
  - How to scale our applications and services.
  - How a new version of the application/service performs compared to a previous version.
  By baselining data, we learn to:
  - Look beyond averages and understand percentiles.
  - Look at historical data and analyse trends.
- Identify optimization areas: There are hundreds of performance metrics, but measuring every single one does not help. Each performance scenario calls for a set of metrics relevant to that scenario, making it easier to understand and correlate the data without poring through unnecessary information. So identify the areas that need optimization, focus on the optimization methodology, and pick only the metrics that matter.
- Business KPIs: When trying to improve performance, start with the business KPIs, look at historical data trends/patterns, and then at the metrics that impact those KPIs. You can then set performance budgets and build processes that keep the focus on performance across the project lifecycle.
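The "look beyond averages" advice in the baselining step can be shown with a short percentile calculation. The latency samples below are made up: a handful of slow outliers barely moves the mean but dominates the tail that real users feel.

```python
# Averages hide tail latency: compare the mean against p50/p99 for a latency sample.
# The sample data is illustrative.

def percentile(samples, p):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[rank]

# 95 fast responses and 5 slow outliers (all in ms)
latencies = [200] * 95 + [4000] * 5

mean = sum(latencies) / len(latencies)
print(mean)                       # 390.0 -- looks acceptable
print(percentile(latencies, 50))  # 200   -- the typical user
print(percentile(latencies, 99))  # 4000  -- the tail users actually feel
```

A baseline built on p95/p99 rather than the mean makes regressions in the tail visible the moment a new version or a traffic surge pushes them up.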
These five points are essential when prepping for a peak event to ensure a great end-user experience. We must remember that no matter how great the tools are, they'll count for little if organizations don't have visibility into the health and reliability of each of the pieces that make up the whole application.
To conclude, we believe that performance management must be viewed as a year-round priority and the performance strategies you have implemented should help you:
- Gain performance visibility.
- Analyse and learn from the data.
- Implement changes and improve consistently.
Watch the webinar for a comprehensive explanation of everything discussed in this blog post.