Monitoring in the Cloud Era

Published

July 18, 2018

Matt Izzo

Companies migrating from a self-managed infrastructure to public cloud solutions, like AWS, Azure, Google, IBM, Alibaba, and Tencent, see significant IT savings. For Internet-based services, this brings a set of new opportunities and challenges in monitoring and managing customers’ digital experience.

How do you monitor performance (speed, reachability, reliability, and availability) of your digital services when you don’t have full visibility or control over the infrastructure?
How do you identify and localize problems when services are part of hybrid infrastructures or multi-cloud architectures with third-party SaaS components?
When does monitoring from within a cloud provider make sense, and when should it be avoided?

So You’ve Moved Your Application to a Public Cloud Provider

If you’ve moved internal applications from your own data center to a public cloud provider, then monitoring from within that cloud provider can serve as your first mile synthetic monitoring. This is good for testing basic functionality, performance, and availability close to the applications themselves, without the “noise” of the external Internet.

Continuously monitor your production environment from the first mile to establish a baseline that you can compare with other vantage points at Internet backbone and broadband ISP points of presence. Remember that your applications may be in the cloud, but your consumers are not. If your apps and services are reachable and reliable from the cloud, that does not mean the same is true for your end users, unless they, too, are in the same cloud. You need a monitoring strategy that pinpoints whether a performance problem is internal, caused by your own update, an outside network issue, or a third-party service component.
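One way to put the first-mile baseline to work is to compare it with the same check run from external vantage points. The sketch below is only an illustration, not a Catchpoint API: the vantage-point names, the sample values, the `baseline_ms` figure, and the 2x `ratio` threshold are all assumptions chosen for the example.

```python
from statistics import median


def localize(samples_ms, first_mile="first_mile", baseline_ms=120.0, ratio=2.0):
    """Classify a slowdown using median response times per vantage point.

    samples_ms: dict mapping a vantage-point name to a list of response times (ms).
    baseline_ms and ratio are illustrative thresholds, not recommended values.
    """
    medians = {name: median(values) for name, values in samples_ms.items()}
    first = medians[first_mile]
    external = {n: m for n, m in medians.items() if n != first_mile}

    # Slow close to the application itself: the problem is likely internal.
    if first > baseline_ms * ratio:
        return "internal", [first_mile]

    # Fast at the first mile but slow elsewhere: look at the network path
    # or a third-party component between the cloud and those users.
    slow = sorted(n for n, m in external.items() if m > first * ratio)
    if slow:
        return "external", slow
    return "healthy", []


samples = {
    "first_mile": [110, 118, 125],
    "backbone_nyc": [140, 150, 160],
    "broadband_lon": [390, 410, 450],
}
print(localize(samples))
```

With these sample values, the first mile looks normal while one broadband vantage point is slow, which points away from the application and toward the delivery path.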

Monitoring a Multi-cloud Solution

If your application requires geographic diversity, you'll likely need a multi-cloud architecture. You may quickly find that no single provider covers all the markets you need. While there is a good amount of overlap in major areas, like London or Washington, DC, coverage in smaller areas can be a challenge. This forces you to build a multi-cloud, multi-provider architecture.

Even if you don’t require geodiversity, [recent major outages](https://news.google.com/search?q=cloud%20outage&hl=en-US&gl=US&ceid=US%3Aen) have proven that the old adage “don’t put all your eggs in one basket” applies to cloud solutions. Just as many web-based services rely on multiple DNS and CDN providers, hosting your business-critical applications with more than one public cloud provider may be a necessity.

In these cases, it’s important to actively monitor your applications and services, from cloud to cloud, across regions and providers. You might discover that not all cloud providers deliver the same level of service under all conditions. As a result, balance your solution accordingly.
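To compare providers on equal terms, cloud-to-cloud probe results can be grouped by path and ranked. The following is a minimal sketch under stated assumptions: the `(source, target, ok, latency_ms)` tuple layout and the `provider_a`/`provider_b` labels are hypothetical, and a real monitoring tool will expose its own data format.

```python
from collections import defaultdict


def summarize_paths(results):
    """Aggregate cloud-to-cloud probe results per (source, target) path.

    results: iterable of (source, target, ok, latency_ms) tuples, where source
    and target are labels such as "provider_a/us-east" (hypothetical names).
    Returns a list of (path, availability, mean_latency_ms), worst path first.
    """
    buckets = defaultdict(list)
    for source, target, ok, latency_ms in results:
        buckets[(source, target)].append((ok, latency_ms))

    summary = []
    for path, probes in buckets.items():
        successes = [lat for ok, lat in probes if ok]
        availability = len(successes) / len(probes)
        # Latency is averaged over successful probes only.
        mean_latency = sum(successes) / len(successes) if successes else None
        summary.append((path, availability, mean_latency))

    # Lowest availability first; ties broken by highest mean latency.
    summary.sort(key=lambda row: (row[1], -(row[2] or 0.0)))
    return summary


probes = [
    ("provider_a/us-east", "provider_b/eu-west", True, 82.0),
    ("provider_a/us-east", "provider_b/eu-west", True, 90.0),
    ("provider_b/eu-west", "provider_a/us-east", True, 85.0),
    ("provider_b/eu-west", "provider_a/us-east", False, 0.0),
]
for path, availability, mean_latency in summarize_paths(probes):
    print(path, f"{availability:.0%}", mean_latency)
```

Sorting the weakest path to the top makes it easier to see where traffic should be rebalanced.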

Monitoring Hybrid Cloud Solutions

It can be impractical, even impossible, to host your entire infrastructure in the cloud. A hybrid solution includes a mix of infrastructure and service elements. The elements must work seamlessly across your local data centers and external cloud providers. Most commonly, your applications make heavy use of third parties and APIs hosted in their own cloud environments.

Another use case is tied to legacy applications within the architecture of your system or service. Banks and financial organizations that still rely on mainframes for core services have taken this approach. This gives them more time to migrate mainframes to newer technology stacks. As a result, banks around the world have projects that start in the cloud, applications migrating to the cloud, and mainframes moving to data centers closer to their cloud providers to reduce latency.

These are some of the reasons why hybrid cloud solutions are quite common. In these cases, communication between your in-house data center and cloud-based services is critical for delivering a quality customer experience. Use synthetic monitoring for both the service and the network between locations to trace performance issues to their root cause. The question “Is it the application, or is it the network?” becomes a major challenge in hybrid environments, one that can increase Mean Time To Resolve (MTTR) and, more dangerously, hide end-user issues from your IT team.

[Figure: Network routing and performance from different cloud providers to an internal data center]
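A rough first answer to “Is it the application, or is it the network?” can come from splitting a synthetic check's timing into phases. The sketch below assumes the check reports durations under the phase names `dns`, `connect`, `tls`, `wait`, and `download`; those names and the sample numbers are illustrative, so map them to whatever your monitoring tool actually reports.

```python
def attribute_delay(timing_ms, network_phases=("dns", "connect", "tls")):
    """Split one synthetic check's timing into network and application time.

    timing_ms: dict of phase name -> duration in ms. Phase names are
    assumptions for this sketch, not a standard.
    """
    network = sum(timing_ms.get(phase, 0.0) for phase in network_phases)
    # "wait" is the time from sending the request to the first response byte.
    # It is treated here as application time, although it also includes
    # roughly one network round trip.
    application = timing_ms.get("wait", 0.0)
    total = sum(timing_ms.values())
    suspect = "network" if network > application else "application"
    return {
        "network_ms": network,
        "application_ms": application,
        "total_ms": total,
        "suspect": suspect,
    }


check = {"dns": 12.0, "connect": 180.0, "tls": 210.0, "wait": 95.0, "download": 20.0}
print(attribute_delay(check))
```

This only narrows the search. Running the same breakdown from both the in-house data center and the cloud side shows which direction of the path carries the delay.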

Not Everything Should Be Monitored from the Cloud

Monitoring from cloud locations is vital for the use cases above. However, your human consumers are accessing your applications from local ISPs, while employees are in corporate facilities. Monitoring from the cloud does not monitor performance along the entire chain of delivery for users and employees.

For example, the following are not suitable for cloud-based monitoring:

SLA measurements for services delivered to end users or within the same cloud
SLA measurements for third-party providers or suppliers that are in the delivery chain (DNS, CDN, cloud providers, ad serving, etc.)
Supplier/provider performance testing and validation for services like DNS, CDN, cloud, SaaS
Benchmarking of consumer service delivery for your competitors or industry
Network/ISP connectivity monitoring and alerting
DNS availability, performance or validation of service based on geolocations

Make Sure Your Monitoring Solution Is Not Solely in the Cloud

Beware the trap of monitoring only from the cloud. Simply put, monitoring a cloud-based consumer service from within the cloud only measures the performance of the cloud itself, not the performance seen by end users.

Even if your app or service is 100% cloud-based, you still need to monitor from non-cloud locations. This includes monitoring from Internet backbone and broadband/ISP points of presence, in-house enterprise locations, consumer last mile locations, and mobile. These vantage points are essential for delivering a complete view of the digital experience you’re providing.

Monitoring the cloud is not just about the vantage points either. The way you collect, store, and analyze data is important as well. If you’re monitoring from the cloud and your cloud provider suffers an outage, you can’t afford to have your entire monitoring solution go down too.

Conclusion

Your strategy to provide the highest quality digital customer experience no doubt includes proactive monitoring. For cloud-based solutions (simple, multi, or hybrid), that monitoring needs to collect performance data along the lines of service delivery. This includes to/from cloud locations, in-house data centers, third-party services, and customer access points. A diverse monitoring solution with a broad set of monitoring vantage points and comprehensive cloud coverage is essential.

Catchpoint provides 111 Cloud Nodes deployed on the six major cloud providers: AWS, Azure, Google, IBM, Alibaba, and Tencent. Our Cloud Nodes cover 62 cities in 25 countries worldwide, the largest regional cloud provider coverage available. As part of Catchpoint’s holistic monitoring approach, there are more than 700 vantage points on Internet backbone, broadband/ISP, last mile, and wireless infrastructure. It’s the largest and most diverse monitoring network in its category.

At Catchpoint, we built an entire system from the ground up that scales globally to deliver real-time insights. We collect performance data from cloud and non-cloud infrastructure and host our data in the highest-rated data center in the US, Switch Supernap. This ensures your digital experience monitoring solution is available even if your cloud infrastructure is not.
