Tips to improve website availability during traffic surges

Published

October 11, 2018

Mehdi Daoudi

When it comes to traffic spikes and website availability, there’s no time like yesterday to prepare. Just because a high-traffic event is expected, like Amazon’s recent Prime Day, doesn’t mean you’re safe from failure.

In this post, we’re going to cover the best tips for preparing your site or application for traffic spikes – both expected and unexpected.

Infrastructure and website availability

When you’re preparing for a high-traffic event, you don’t want to neglect the backend of your site or application. You definitely don’t want to lose customers due to server overload or another preventable outage.

Run load and stress tests to make sure your servers can handle increased traffic. Deploy additional servers in case they can’t. Put a plan in place to deploy or spin up additional instances in case the traffic exceeds what you’ve prepared for.
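As a rough illustration of what a basic load test does, here is a minimal Python sketch that fires a fixed number of concurrent requests at one URL and summarizes failures and latency. The function names, the default request counts, and the idea of pointing it at a staging URL are assumptions for illustration; a dedicated load-testing tool will give you far more realistic traffic.

```python
import http.client
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10.0):
    """Fetch one URL and return (succeeded, seconds_taken)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            ok = 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are OSError subclasses.
        ok = False
    return ok, time.perf_counter() - start


def run_load_test(url, total_requests=200, concurrency=20, fetch_fn=fetch):
    """Send total_requests requests (at least 1), `concurrency` at a time."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(fetch_fn, [url] * total_requests))

    latencies = sorted(seconds for _, seconds in results)
    failures = sum(1 for ok, _ in results if not ok)
    # Nearest-rank 95th percentile, computed with integer math.
    p95_index = (95 * len(latencies) + 99) // 100 - 1
    return {
        "requests": total_requests,
        "failures": failures,
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
    }
```

You might call it as `run_load_test("https://staging.example.com/", total_requests=500, concurrency=50)`, where the staging hostname is a placeholder. Only run it against systems you own, and watch the failure count and 95th-percentile latency as you raise the concurrency.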

And don’t stop with servers. Make sure everything else can handle a heavier load too: check your network links and network equipment. Have a backup site ready (more on this below) that you can activate via DNS.

Make sure you check your monitoring solution for insights into how you can improve server performance, front-end code, SQL queries, databases, etc. And don’t forget to have backups ready that you can switch to in case of an outage.

CDN

Take advantage of content delivery networks and caching services to reduce latency and bottlenecks. CDNs also improve your website availability and keep performance consistent by serving content from the points of presence (PoPs) closest to your users.

Consider a backup CDN so you can switch over in case of an unexpected outage. This will keep your delivery consistent and prevent any significant changes to your users’ experience.
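The switch-over decision itself can be very simple. The sketch below, written in Python, picks the first CDN hostname in priority order that passes a health probe. The hostnames, the `/health.txt` path, and the rule that any status below 500 counts as healthy are all assumptions for illustration; in practice the switch usually happens through DNS or your CDN vendor’s own tooling.

```python
import http.client


def probe(host, path="/health.txt", timeout=3.0):
    """Return True if the host answers a HEAD request without a 5xx error."""
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status < 500
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def pick_cdn(hosts, is_healthy=probe):
    """Return the first healthy host in priority order, or None if all fail."""
    for host in hosts:
        if is_healthy(host):
            return host
    return None


# Hypothetical hostnames, listed in order of preference.
CDN_HOSTS = ["cdn-primary.example.com", "cdn-backup.example.com"]
```

Calling `pick_cdn(CDN_HOSTS)` returns the primary while it is healthy and falls back to the backup when it is not. A result of `None` means every CDN failed the probe and you need to serve from origin or show your error page.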

Web

You want to code your site for optimal performance at all times, but especially during traffic spikes. Reduce your bandwidth by compressing text files (like HTML, CSS, JavaScript, and XML). Compression can decrease file size by as much as 70%, which dramatically improves download speed.
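To get a feel for how much compression can save on your own files, you can measure it directly. This Python sketch gzips a payload and reports the fraction of bytes saved. The sample markup is made up and very repetitive, so it compresses better than most real pages; treat the number as an upper-end illustration, not a benchmark.

```python
import gzip


def gzip_savings(payload: bytes, level: int = 6) -> float:
    """Return the fraction of bytes saved by gzip (0.7 means 70% smaller)."""
    if not payload:
        return 0.0
    compressed = gzip.compress(payload, compresslevel=level)
    return 1 - len(compressed) / len(payload)


# Made-up, highly repetitive markup standing in for a product listing page.
sample_html = b"<li class='product'><a href='/item'>Item</a></li>\n" * 500

print(f"gzip saves {gzip_savings(sample_html):.0%} on the sample markup")
```

To check a real file, read it in binary mode and pass the bytes to `gzip_savings`. Images and other already-compressed formats will show little or no saving, which is why compression is usually enabled for text formats only.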

Enable Keep-Alives so that your users’ browsers don’t have to establish new connections continually. Try to keep each image under about 80 KB. If your images are too big, try a tool like Image Optimizer or RIOT to resize them.
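If you want a quick list of which images to shrink first, a short script can find them. This Python sketch walks a directory and reports image files larger than the 80 KB guideline, biggest first. The list of file extensions and the directory you point it at are assumptions; adjust both to match your site.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
MAX_BYTES = 80 * 1024  # the 80 KB guideline


def oversized_images(root, max_bytes=MAX_BYTES):
    """Return (path, size_in_bytes) for each image over the limit, largest first."""
    found = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            size = path.stat().st_size
            if size > max_bytes:
                found.append((path, size))
    return sorted(found, key=lambda item: item[1], reverse=True)
```

For example, `oversized_images("static/img")` (a hypothetical asset folder) gives you a to-do list for your image optimizer.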

Simplify above-the-fold content and cut back on the content users will see as they scroll your pages. Add a search feature to the top of your page so that users scroll less often – this will reduce users’ load times.

As of 24 July 2018, nearly 30% of the top 10 million websites support HTTP/2. If you haven’t already, you should consider the HTTP/2 switch. HTTP/2 can speed up your site’s download times, may help your search rankings, and much more.

Mobile

You know that your site needs to be responsive. But there’s a lot more that goes into keeping the mobile version ready for a traffic surge or big event.

Eliminating unnecessary files, getting rid of images you don’t have to have, and decreasing image size will keep your speed and performance consistent on mobile versions.

TCP is not optimized for mobile, and it’s expensive to keep closing and reopening connections on networks with high latency and packet loss. So, just like with web preparations, enable Keep-Alives so that your users don’t need new TCP connections all the time.
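To see what connection reuse looks like from the client side, here is a small Python sketch that requests several paths over a single connection using the standard library’s `http.client`. The function name and its parameters are made up for illustration. Keep-Alive itself is enabled in your web server’s configuration, not in client code like this.

```python
import http.client


def fetch_over_one_connection(host, paths, port=443, use_tls=True, timeout=10.0):
    """Request each path in turn on one connection and return the status codes."""
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    statuses = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body so the connection can be reused
            statuses.append(response.status)
    finally:
        conn.close()
    return statuses
```

When the server keeps the connection alive, all of these requests share one TCP (and TLS) handshake. If the server closes the connection after each response, `http.client` reconnects on the next request, and you pay the handshake cost every time.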

Error pages

We recently wrote about putting your error pages to good use. They can serve you so much better when they’re more than just error pages.

Since you want to retain business and not lose any new customers that visit during a spike, set up error pages that have minimal functionality. For example, you could create a light site that offers a few of your best-sellers – Amazon did this during the recent Prime Day outage.

Another great way to hold on to a potential customer is an error page that collects a visitor’s email address in exchange for a coupon or discount on your products or services. Check out this example from Bonobos:

[Image: Bonobos error page example]

Third parties

When it comes to an upcoming major event, you need to watch your third parties like a fast-paced M. Night Shyamalan thriller. In other words, keep a close watch.

Your third parties may or may not be monitoring their entire infrastructure, so your best bet is to prepare for any of your vendors to go down.

You should use a reputable tag manager – this will help you wrangle issues fast, often before they’ve affected your customers. Make sure you know where your tags are and which third parties they belong to.
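One low-tech way to start that inventory is to scan a page’s HTML for scripts, iframes, and images loaded from other hosts. The Python sketch below does this with the standard library. The class and function names are made up, and it only sees tags present in the static HTML, so anything injected later by JavaScript or a tag manager will not show up.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class TagCollector(HTMLParser):
    """Collect the src attribute of every script, iframe, and img tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag in {"script", "iframe", "img"}:
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def third_party_hosts(html, first_party_host):
    """Count tags per host, skipping the first-party host and its subdomains."""
    collector = TagCollector()
    collector.feed(html)
    hosts = Counter()
    for src in collector.sources:
        host = urlparse(src).hostname  # None for relative URLs like /app.js
        if not host:
            continue
        if host == first_party_host or host.endswith("." + first_party_host):
            continue
        hosts[host] += 1
    return hosts
```

Feed it the HTML of your key pages along with your own domain, and you get a count of tags per outside host that you can match against your vendor list.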

Keep tags as lean as you can during events – eliminate any ad tags that you don’t need. If you’ve got to have the ads there, make sure they aren’t delivering Flash, video, or large image files.

Have backups for your third parties – and hold them accountable to their SLAs.
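It helps to translate an SLA percentage into actual minutes before the event, so you know what you can hold a vendor to. Here is a small Python sketch of that arithmetic. The 30-day window is an assumption; check each contract for the real measurement period and for what counts as downtime.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime an uptime SLA permits over the given window."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


def sla_met(downtime_minutes, sla_percent, days=30):
    """Return True if the observed downtime stays within the SLA allowance."""
    return downtime_minutes <= allowed_downtime_minutes(sla_percent, days)


for sla in (99.0, 99.9, 99.99):
    print(f"{sla}% uptime allows about {allowed_downtime_minutes(sla):.1f} minutes down per 30 days")
```

A 99.9% SLA over 30 days works out to roughly 43 minutes of allowed downtime, so a vendor outage of an hour during your event would already be a breach.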

You can’t control your third parties’ outages, but you can control the way you communicate about them internally, and you can control how you communicate with third parties when something goes down. Know who’s responsible for communicating with specific third parties – make sure that person has contact info readily available.

Monitoring

When it comes to monitoring, first you need to know what you’ve got that might break. Dig in and figure out each piece of your infrastructure so that you can streamline and make a plan for how you’ll monitor each part.

There’s a lot more going on with applications than most people are aware of. Make sure you’re monitoring every piece of your infrastructure – DNS, CDNs, third parties, cloud, backbone and last mile nodes, all of it. Monitor each point along the user’s path. For example, make sure to monitor every step from adding an item to the shopping cart through the checkout process.
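A synthetic check of a user path boils down to running each step in order, timing it, and stopping at the first failure. The Python sketch below shows that shape. The step names, the two-second budget, and the idea that each step is a plain callable are assumptions for illustration; a real monitoring product drives an actual browser or API client from multiple locations.

```python
import time


def run_journey(steps, budget_s=2.0):
    """Run (name, action) steps in order and report timing for each one.

    A step fails if its action raises. Later steps are skipped after a
    failure because they depend on the earlier ones succeeding.
    """
    report = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
            ok = True
        except Exception:
            ok = False
        elapsed = time.perf_counter() - start
        report.append(
            {"step": name, "ok": ok, "slow": elapsed > budget_s, "seconds": elapsed}
        )
        if not ok:
            break
    return report
```

You would pass in something like `[("search", do_search), ("add_to_cart", do_add), ("checkout", do_checkout)]`, where each function is your own hypothetical code that performs the request and raises on an unexpected response. The report then tells you which step broke or slowed down.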

Use real user monitoring (RUM) to get an idea of the true experience of your users. Use synthetic monitoring to test all the pieces of your infrastructure continuously. You can also use synthetic monitoring to test a higher traffic load.

Process

To handle any outage or issue effectively, you need to have planned processes for communication and procedures.

Know who’s responsible for what regarding internal and external (third party) communication.
Set goals and communicate about KPIs with your team in a daily meeting or standup.
Make sure the appropriate teams within your organization (like product, marketing, or PR) are aware of your communication plans. For instance, PR or marketing might need to break the news of an outage due to traffic to your social media audience.

Testing

There’s no better way to prep for an event than to test your entire site.

Test your code and functionality – especially the functionality of your most critical components like your shopping cart transactions or critical product pages.

If you’ve added any new products or pages to your site, be sure to test them rigorously. Make sure the new additions haven’t introduced any new problems.

Conduct a full-on test of the entire site and increase the number of virtual users that you’re testing with so that you can see how the site performs with higher amounts of traffic. Make sure you test your entire transaction process with simulated user interactions, including product searches.
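Increasing the number of virtual users step by step shows you where the site starts to struggle. This Python sketch ramps through a list of user counts and stops at the first step where the error rate passes a threshold. The step sizes, the 1% threshold, and the `request_fn` callable (your own function that performs one simulated interaction and returns True on success) are all assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def error_rate(request_fn, virtual_users, requests_per_user=5):
    """Run virtual_users concurrent users and return the fraction of failed requests."""

    def one_user(_user_id):
        failures = 0
        for _attempt in range(requests_per_user):
            try:
                if not request_fn():
                    failures += 1
            except Exception:
                failures += 1
        return failures

    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        failed = sum(pool.map(one_user, range(virtual_users)))
    return failed / (virtual_users * requests_per_user)


def ramp(request_fn, user_steps=(10, 25, 50, 100), max_error_rate=0.01):
    """Raise the load step by step and stop once errors exceed the threshold."""
    results = []
    for users in user_steps:
        rate = error_rate(request_fn, users)
        results.append((users, rate))
        if rate > max_error_rate:
            break
    return results
```

The last entry in the returned list is either your highest tested load or the first user count at which the site started failing, which gives you a rough ceiling to plan capacity around.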

In conclusion

When you’re preparing for a major event or traffic surge, leave no stone unturned. Check each part of your infrastructure and have backups prepared in case something goes down.

If you’re relying on lots of third parties, try to cut down, and make sure you have a plan for communication should one of your vendors fail.