Black Friday weekend has ended, and apparently, ’tis the season to be failing.
Over the course of four days, the eCommerce industry saw a handful of its giants experience major outages. This kind of thing tends to happen every holiday shopping season—one or more high-profile sites crumble under the stress of so many shoppers as a result of poor planning.
Cyber Monday’s biggest mishap is a perfect example of what is at stake for an ill-prepared eCommerce site on the biggest online shopping day of the year.
Retail giant Target made headlines Monday for all the wrong reasons with an outage that left customers experiencing spotty performance. The outage prevented some customers from accessing the Target website, while others were faced with this message as they tried to add items to their shopping cart:
Some customers were also greeted with this message at random points during their shopping trip:
An abundance of traffic, or “customers in line,” should never be an acceptable excuse for a site failing on such a major shopping holiday, given how much time the company has to prepare.
Performance issues like the one Target experienced typically come down to capacity planning, or lack thereof. Load testing is only a small portion of capacity planning, so you’re left with a lot of holes if you don’t cover your bases.
Capacity planning is a cross-departmental effort, including tasks like:
- Coding your site with performance in mind
- Compressing and optimizing images, utilizing tag managers, etc.
- Ensuring that there are enough servers
- Pre-compiling and caching static HTML for heavily visited pages
- Deploying CDNs for all static content
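To make the caching item concrete, here is a minimal sketch of serving pre-rendered HTML for hot pages. It is an illustration under stated assumptions, not a description of how Target or any particular platform works: the `PageCache` class, its `warm` and `get` methods, and the 60-second lifetime are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


class PageCache:
    """Hypothetical in-memory cache of pre-rendered HTML with a time-to-live."""

    def __init__(
        self,
        render: Callable[[str], str],
        ttl_seconds: float = 60.0,  # assumed lifetime; tune per page
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._render = render
        self._ttl = ttl_seconds
        self._clock = clock
        self._pages: Dict[str, Tuple[float, str]] = {}
        self.render_count = 0  # how many times the expensive render ran

    def warm(self, paths: Iterable[str]) -> None:
        """Pre-render the busiest pages before the traffic arrives."""
        for path in paths:
            self._store(path)

    def get(self, path: str) -> str:
        """Return cached HTML while it is fresh; otherwise render again."""
        entry = self._pages.get(path)
        if entry is not None and self._clock() - entry[0] < self._ttl:
            return entry[1]
        return self._store(path)

    def _store(self, path: str) -> str:
        html = self._render(path)
        self.render_count += 1
        self._pages[path] = (self._clock(), html)
        return html
```

Calling `warm` on the pages you expect to be hammered means the first wave of shoppers gets a cached copy instead of triggering a render each. In production the same idea usually lives in a shared cache or at the CDN edge rather than in process memory.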
It’s also important to overestimate your traffic predictions. Prepare for a traffic surge that is triple the size of your realistic expectations so that you have headroom for whatever you failed to predict. Keeping hot-standby servers ready to go online when needed is another solid way to avoid the performance issues that come with a drastic increase in traffic. No length is too great for eCommerce companies to go to ensure availability and protect revenue during peak shopping holidays.
To Target’s credit, it appears that they did have a contingency plan in place to minimize the damage of an outage of this magnitude. They were able to catch the problem before their site crashed completely, and they communicated the issue directly to their customers.
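The arithmetic behind "triple your realistic expectations" can be sketched in a few lines. Everything here is an assumption made for illustration: the function name `servers_needed`, the 75% target utilization, and the request rates in the usage note are hypothetical, and real per-server throughput has to come from your own load tests.

```python
import math


def servers_needed(
    expected_peak_rps: float,
    rps_per_server: float,
    surge_multiplier: float = 3.0,     # plan for triple the realistic peak
    target_utilization: float = 0.75,  # assumed: keep servers below 75% busy
) -> int:
    """Estimate how many servers cover a surge with headroom to spare."""
    if expected_peak_rps < 0 or rps_per_server <= 0:
        raise ValueError("request rates must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    planned_rps = expected_peak_rps * surge_multiplier
    usable_rps_per_server = rps_per_server * target_utilization
    return math.ceil(planned_rps / usable_rps_per_server)


# Hypothetical numbers: 2,000 requests/s expected, 500 requests/s per server.
baseline = servers_needed(2000, 500, surge_multiplier=1.0)
surge = servers_needed(2000, 500)
hot_standby = surge - baseline  # servers kept ready but not yet in rotation
```

With these made-up inputs the baseline comes to 6 servers and the tripled plan to 16, leaving 10 that could sit on hot standby until the surge actually materializes.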
We’ve dedicated many blog posts in the past to the importance of failure planning, and while Target probably lessened the impact of this issue by having these error messages on standby, could there have been a better way?