Glossary of Terms

Website Uptime Monitoring

What is website uptime monitoring?

Website uptime monitoring is the practice of continuously checking whether a site or application is reachable and working, and tracking how long it stays that way.

Website uptime, or uptime, is a metric used in performance monitoring to measure the amount of time that a site is “up,” “live,” or “available”: essentially, the amount of time a website is working and accessible to users.

Website uptime and website availability are often used interchangeably, but they are not the same thing. Uptime is the amount of time a system is operational, while availability is the percentage of time that the system is operational.
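
To make the distinction concrete, here is a minimal Python sketch. The function name and the 30-day window with 45 minutes of downtime are illustrative assumptions, not figures from any real site.

```python
def availability_percent(uptime_seconds: float, total_seconds: float) -> float:
    """Return availability as a percentage of the observation window."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * uptime_seconds / total_seconds


# Hypothetical 30-day window with 45 minutes of downtime.
window = 30 * 24 * 60 * 60      # length of the window in seconds
uptime = window - 45 * 60       # uptime is a duration (seconds)
availability = availability_percent(uptime, window)  # availability is a ratio
print(f"uptime: {uptime} s, availability: {availability:.3f}%")
```

Uptime here is a number of seconds, while availability is the share of the window those seconds represent.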

If a website isn’t up, it’s down: there’s an outage or other issue, and customers can’t access the business’ website, application, or online store. That makes uptime a business-critical metric that directly impacts productivity and profit.

Why website uptime monitoring matters to businesses

If a website isn’t live, then customers can’t visit it. It’s easy to understand how important that is to an ecommerce site, like Amazon, but downtime is just as detrimental to the revenue of other kinds of companies.

Software-as-a-service (SaaS)

SaaS companies, for example, need their applications to stay live. Otherwise, they risk breaching their service level agreements (SLAs). When a service level agreement is broken, the company in breach may have to pay a penalty, which can reach hundreds of thousands of dollars.

Ecommerce

Ecommerce stores are essentially “closed” when they’re down. It’s vital that ecommerce businesses focus on having as much uptime as possible.

Travel

Large travel sites, like Priceline, Booking.com, and Expedia, rely directly on uptime for their revenue. They depend not only on the uptime of their own sites, but also on the uptime of the other sites from which they compile the best deals.

What to monitor to improve website uptime

It takes more than just the home page of a website to keep the site live or “up” and working. Today’s Internet is complex: the cloud and many other structures and components are involved in the performance of sites and apps, and each component is as critical to uptime as the next.

DNS

The Domain Name System (DNS) turns a domain name, like example.com, into the numeric IP address that computers and servers use to communicate. DNS is the first step in the journey of information from one machine to another: the lookup happens as soon as the user enters a domain into the browser.

Observing DNS should be a priority in improving uptime. Without functioning DNS, users can’t reach a site or application at all. Imagine a customer who needs to travel to a physical store but can’t get further than their own driveway because they have no directions. The store may be open, but that customer will never arrive.
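
A basic DNS check can be sketched with Python’s standard library. This is a minimal illustration, not a full monitoring agent: the function names are hypothetical, and the injectable resolver argument is an assumption made so the check can be exercised without a network.

```python
import socket
from typing import Callable, List


def resolve_addresses(hostname: str,
                      resolver: Callable[..., list] = socket.getaddrinfo) -> List[str]:
    """Return the sorted unique IP addresses for hostname, or [] if lookup fails."""
    try:
        results = resolver(hostname, None)
    except OSError:  # socket.gaierror is a subclass of OSError
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in results})


def dns_is_up(hostname: str,
              resolver: Callable[..., list] = socket.getaddrinfo) -> bool:
    """The check passes when the name resolves to at least one address."""
    return len(resolve_addresses(hostname, resolver)) > 0
```

Calling dns_is_up("example.com") with the default resolver performs a real lookup from the machine running the check. A production monitor would typically repeat it from several locations.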

CDN

Content Delivery Networks (CDNs) deliver content to a user’s computer or phone, so the proximity of a CDN server to the user affects download speed.

CDNs stage content closer to end users, which reduces latency and improves performance. The closer the content is, the faster it’s downloaded. A multi-CDN strategy provides coverage in multiple geographies for improved speed and performance. Proximity to users is important, but the chief benefit of multiple CDNs is having a failover in case there’s an outage.

Observing CDNs is vital to uptime. With multiple CDNs in place in various locations, IT teams must know which location is experiencing downtime. To extend the travel analogy, suppose there is one store 5 miles away and another 30 miles away. The user would usually want to go to the closer store, but if the road to it is closed, they can trek out to the farther store for what they need.
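
The failover idea can be sketched as a small selection rule: prefer the closest healthy CDN, and fall back to a farther one when the closest is down. The class, field names, and CDN labels below are hypothetical, and latency stands in for proximity as a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CdnStatus:
    name: str          # hypothetical label for a CDN or point of presence
    latency_ms: float  # measured latency from the user's region
    healthy: bool      # result of the most recent health check


def pick_cdn(statuses: List[CdnStatus]) -> Optional[CdnStatus]:
    """Prefer the lowest-latency healthy CDN; return None if all are down."""
    healthy = [s for s in statuses if s.healthy]
    return min(healthy, key=lambda s: s.latency_ms) if healthy else None


near = CdnStatus("cdn-near", latency_ms=20.0, healthy=False)  # closer, but down
far = CdnStatus("cdn-far", latency_ms=85.0, healthy=True)     # farther, but up
chosen = pick_cdn([near, far])
print(chosen.name if chosen else "no CDN available")
```

The rule only works if the health flags are current, which is why each CDN location needs to be observed separately.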

Servers

Servers store all the information on a website. If enough servers go down, the CDNs have nothing to fetch and deliver, so the site will likely go down.

IT teams need to monitor their servers so they can switch over to backups should there be an outage. Again, imagine the user physically journeying to the business location. This time, instead of being stuck in their driveway, they make it to the business’ parking lot, only to find that the building has vanished.

Third parties and cloud providers

Third parties are the components of a site or application that aren’t owned by the company running it. These vendors can provide an analytics tool, a marketing platform that integrates with the website or application, or any other external software.

Third parties hosted via cloud are usually SaaS (software as a service), PaaS (platform as a service), or IaaS (infrastructure as a service).

  • SaaS: uptime can depend on software hosted in the cloud, like the analytics tools mentioned above or display ad software.
  • PaaS: providers host hardware and software in the cloud that developers use to develop and deploy code. If a PaaS goes down, code may not be deployed to the website or to other parts of the site or application’s infrastructure.
  • IaaS: providers offer virtualized infrastructure components in the cloud. If an IaaS provider such as Amazon Web Services (AWS) goes down, an entire site that depends on its many intertwined services can go down with it.

Other website uptime metrics to monitor

Downtime isn’t the only metric that negatively impacts end users. Modern sites and applications contain multiple functioning parts, like third-party credit card processors, analytics tools, microservices, and more.

Latency

Today’s users often feel that if a site is slow, it may as well be down. In response, some companies now write latency targets into their service level agreements (SLAs).
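
A latency target is often expressed as a percentile, for example “95% of requests complete within a given time.” The sketch below assumes that form and uses the nearest-rank percentile method. The function names and sample values are illustrative only.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def meets_latency_target(samples: Sequence[float], pct: float, target_ms: float) -> bool:
    """True when the chosen percentile of observed latency is within the target."""
    return percentile(samples, pct) <= target_ms


# Hypothetical response times in milliseconds.
latencies = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
print(meets_latency_target(latencies, 95, 800))
```

A percentile is used instead of an average because a few very slow requests can hide behind a healthy-looking mean.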

Functionality

Just because a site is up doesn’t mean it works. For example, a site might work perfectly on a desktop, but if it isn’t mobile responsive, mobile users may find it impossible to add items to their shopping carts. Is that site considered down?

What if the “add to cart” button isn’t working on a site? If users can’t make a purchase, is the online store down? Most would say “yes.”
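
One way to capture this in a check is to look beyond the HTTP status and confirm that the page contains the element users need. The sketch below is a simplified assumption of how such a check might classify a response. The function name, the three labels, and the marker string are hypothetical.

```python
def classify_response(status_code: int, body: str, required_text: str) -> str:
    """Classify a page as 'down', 'broken', or 'ok'."""
    if not 200 <= status_code < 400:
        return "down"    # the page could not be served at all
    if required_text not in body:
        return "broken"  # reachable, but the feature users need is missing
    return "ok"


# Hypothetical marker for the add-to-cart button in the page's HTML.
marker = 'id="add-to-cart"'
print(classify_response(200, "<html><body>Product page</body></html>", marker))
```

A text match only proves the button is present in the page. Confirming that it actually works requires a scripted transaction that clicks it.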

Scheduled downtime

Sites and applications require routine maintenance. Some companies consider this routine maintenance downtime to be actual downtime. Others omit it from the metric objectives in their service agreements, as it’s part of keeping an application working properly.
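
The two conventions produce different numbers from the same incident data. The sketch below shows both, under the assumption that excluded maintenance windows are removed from the measured period entirely. The function name and the figures are illustrative.

```python
def availability_with_maintenance(total_s: float,
                                  unplanned_down_s: float,
                                  planned_down_s: float,
                                  exclude_planned: bool) -> float:
    """Availability in percent, with or without scheduled maintenance counted as downtime."""
    if exclude_planned:
        # Maintenance windows are removed from the measured period.
        measured = total_s - planned_down_s
        down = unplanned_down_s
    else:
        # Maintenance counts as downtime like any other outage.
        measured = total_s
        down = unplanned_down_s + planned_down_s
    if measured <= 0:
        raise ValueError("measured period must be positive")
    return 100.0 * (measured - down) / measured


# Hypothetical 30-day month: 20 minutes unplanned, 2 hours planned.
month = 30 * 24 * 3600
strict = availability_with_maintenance(month, 20 * 60, 2 * 3600, exclude_planned=False)
lenient = availability_with_maintenance(month, 20 * 60, 2 * 3600, exclude_planned=True)
print(f"counting maintenance: {strict:.3f}%  excluding maintenance: {lenient:.3f}%")
```

Whichever convention is used, it should be stated in the service agreement so both parties calculate the same figure.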

The basics of a website uptime monitoring strategy

If a business can get ahead of a potential issue, or catch issues early on, then it can minimize the repercussions. Observing uptime is essential to improving user experience and ensuring third parties meet their SLAs.

A strong uptime strategy must include active monitoring, real user monitoring, and a plan for managing service level agreements.

Active monitoring

Active monitoring uses agents or nodes that mimic user behavior in scripted tests. These tests run 24/7 to detect downtime, outages, latency, and other important metrics.

Since the tests run 24/7, they help companies get ahead of potential threats to user experience. IT teams can set alerts to be notified when important thresholds are passed — for example, if many users are experiencing latency greater than 5 seconds or if downtime lasts more than X seconds.
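
An alert rule of this kind can be sketched as a function over recent probe results. The 5-second latency threshold comes from the example above. Using three consecutive bad probes as the stand-in for sustained downtime is an assumption, as are the class and function names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProbeResult:
    ok: bool          # did the synthetic test succeed?
    latency_s: float  # how long the test took, in seconds


def should_alert(results: List[ProbeResult],
                 max_latency_s: float = 5.0,
                 consecutive_bad: int = 3) -> bool:
    """Alert when the most recent probes were all failed or too slow."""
    recent = results[-consecutive_bad:]
    if len(recent) < consecutive_bad:
        return False  # not enough history to call it sustained
    return all((not r.ok) or r.latency_s > max_latency_s for r in recent)
```

Requiring several bad probes in a row trades a slightly later alert for fewer false alarms from a single dropped request.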

As another example, if a server is experiencing high response times, a business would want to pinpoint which server it is. While passive monitoring points to the high response time, active monitoring can dig deeper and pinpoint the exact server responsible for the delays. Finding the specific server quickly reduces downtime and helps maintain availability.
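
If each backend server is probed individually, pinpointing the slow one becomes a simple comparison. The sketch below assumes per-server response-time samples are already collected. The server names, threshold, and function name are hypothetical.

```python
from statistics import median
from typing import Dict, List


def slow_servers(samples_by_server: Dict[str, List[float]],
                 threshold_ms: float) -> List[str]:
    """Return the names of servers whose median response time exceeds the threshold."""
    return sorted(
        name
        for name, samples in samples_by_server.items()
        if samples and median(samples) > threshold_ms
    )


# Hypothetical response times in milliseconds from direct probes of each server.
probes = {
    "web-1": [120.0, 130.0, 110.0],
    "web-2": [900.0, 950.0, 1000.0],
    "web-3": [140.0, 125.0, 135.0],
}
print(slow_servers(probes, threshold_ms=500.0))
```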

Real user monitoring

Real user monitoring (RUM) collects data from real users of an application or website via performance monitoring software. Real user monitoring allows a business to catch uptime issues early, as soon as it detects that a portion of users are experiencing problems.
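
Detecting that “a portion of users” is affected usually means grouping real sessions by some attribute and comparing error rates. The sketch below groups by region. The session format, region labels, and function name are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def error_rate_by_region(sessions: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each region to the fraction of its sessions that hit an error."""
    totals: Dict[str, int] = defaultdict(int)
    errors: Dict[str, int] = defaultdict(int)
    for region, had_error in sessions:
        totals[region] += 1
        if had_error:
            errors[region] += 1
    return {region: errors[region] / totals[region] for region in totals}


# Hypothetical sessions reported by real users: (region, had_error).
sessions = [("eu", True), ("eu", False), ("us", False), ("us", False)]
rates = error_rate_by_region(sessions)
print(rates)
```

A region whose error rate stands out from the rest points to a localized problem that a single global average would hide.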

SLA management

SaaS, PaaS, and IaaS companies have legal contracts with their customers. These contracts, called service level agreements (SLAs), outline the level of performance the application must meet. If the promised levels of performance are not met, the provider is in breach of its contract. A breach typically results in fines paid to the customer.

Companies need to use both active and real user monitoring to ensure they’re meeting their SLAs, particularly the agreed-upon uptime percentage. Companies should also observe their third-party applications to hold those vendors accountable to their SLAs. For example, if a business application relies on AWS, it’s important that the business monitor AWS to determine whether AWS is a problem source and to ensure AWS meets its SLA.
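
An agreed-upon uptime percentage translates directly into a downtime budget for the billing period. The sketch below shows that arithmetic. The 99.9% figure and the 30-day period are illustrative assumptions, not terms from any particular agreement.

```python
def allowed_downtime_seconds(sla_percent: float, period_seconds: float) -> float:
    """Downtime budget implied by an uptime percentage over a period."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return period_seconds * (100.0 - sla_percent) / 100.0


def sla_met(observed_down_seconds: float, sla_percent: float, period_seconds: float) -> bool:
    """True when observed downtime stays within the budget."""
    return observed_down_seconds <= allowed_downtime_seconds(sla_percent, period_seconds)


# Hypothetical agreement: 99.9% uptime over a 30-day period.
period = 30 * 24 * 3600
budget = allowed_downtime_seconds(99.9, period)
print(f"downtime budget: {budget / 60:.1f} minutes")
```

The same calculation can be run against a vendor’s published SLA to check whether its measured downtime stayed within budget.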

Conclusion

Website uptime observability is a vital piece of a company’s overarching digital experience observability strategy, both for SaaS companies and for any business with a web presence. A 360-degree view across DNS, CDNs, servers, and third parties allows companies to get ahead of issues and fix them faster, so they can stay ahead of the competition.