Introducing Internet Performance Monitoring: How does it help?
In part 2 of our ITOps Times Internet Resilience series, we dive deeper into the world of Internet Performance Monitoring (IPM) with Gerardo Dada, Catchpoint CMO.
Catchpoint and ITOps Times break down 6 critical topics you need to understand to ensure Internet Resilience for your business in this bi-weekly microwebinar series, each lasting less than 10 minutes.
Explore each of the topics in the series:
- (This Post) Introducing IPM: How does it help?
- 5/18 @ 1 pm ET – How can companies improve Network and API performance?
- 6/1 @ 1 pm ET – How can companies improve their employees’ digital experience?
- 6/15 @ 1 pm ET – How can you improve conversions on your website?
In this second installment, we’ll dive deeper into the world of Internet Performance Monitoring (IPM). Learn how IPM can help you proactively find and fix issues in your Internet Stack before they impact the business.
Now, let’s get into the episode!
Watch the live Q&A with Gerardo Dada, CMO of Catchpoint, or read the video transcript below.
Hello everyone. Welcome to the second installment of the ITOps Times Live Microwebinar series on Internet Resilience. I'm Dave Rubinstein, editor in chief of ITOps Times. And, just to recap the last time we were here, we were discussing Internet Resilience and why organizations should care about it. This time, Episode 2, we're going to be introducing the concept of Internet Performance Monitoring and how it can help ensure resilience. With me today is Gerardo Dada. He's the chief marketing officer at Catchpoint and welcome to the show.
Thank you, David. Pleasure being here.
Yeah, sure. So let me ask you, last week, we spoke to your CEO, Mehdi Daoudi, who was talking about the importance of the Internet and how complex and fragile it can be. So I think we should introduce this time talking about the importance of having visibility into the Internet to prevent disruptions before they occur.
Right. Mehdi spoke about how every business and every interaction depends on the Internet nowadays, right? I have teenage girls, and if the Internet goes down in my house, it's only a matter of seconds until I hear about it. I need to do something about it.
The bottom line is the Internet is your new network for employees. Everybody seems to be remote nowadays and applications have become more and more distributed. I used to work for one of the leading vendors in network performance management, and the fact is, companies like Catchpoint don't even have a network anymore. We don't have routers because we don't even have an office where we'd put the routers, and of course, larger companies will still have them for many years. They're still an important investment.
But monitoring the Internet is something that needs to be done in most companies because that's where a large percentage of the incidents happen. And a key thing to think about is that if you consider the outages for large companies like Amazon and Microsoft and Facebook who have a very deep bench and skills and tools on monitoring and networking etcetera - they still suffer outages.
The question is not if it's going to happen to you - it's when is it going to happen, how bad is it going to be, how much is it going to cost? And really the most important question is, are you able to catch it quickly and fix it before it creates a massive problem? In order to do that you need to pay attention to what we’re calling the Internet Stack. And look, this diagram is not necessarily 100% accurate, we’re just trying to depict the idea that there are many things inside the Internet that need to be monitored. It can be organized logically and there are groups of things that are codependent. Cloud services depend on Internet core services, there's a set of network technologies, there's a set of protocols. All of those things need to be paid attention to.
Any network manager knows that if your DNS is not working properly, nothing works, right? Few people know, for example, what BGP is – the border gateway protocol. Facebook learnt this lesson during the outage that everybody knows from a year and change ago which had Facebook, WhatsApp and Instagram down. It cost the company hundreds of millions of dollars because they were not paying attention to BGP. They didn't understand the importance, how to monitor and how to detect any route leaks or any changes in that routing protocol.
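As a rough illustration of what "paying attention to BGP" can mean in practice, the sketch below compares observed route announcements against the origin AS you expect for each of your prefixes, and flags prefixes that have vanished from the routing table altogether. It is a minimal sketch under stated assumptions, not a description of how Catchpoint or any other product works: the `Announcement` type, the `find_origin_changes` function, and the sample prefixes and AS numbers (taken from the ranges reserved for documentation) are all hypothetical. It only catches origin changes and withdrawals; detecting route leaks in general requires analyzing the full AS path.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Announcement:
    """One observed BGP route (hypothetical shape, for illustration only)."""
    prefix: str                # e.g. "192.0.2.0/24"
    as_path: Tuple[int, ...]   # last element is the origin AS


def find_origin_changes(
    expected_origins: Dict[str, int],
    observed: List[Announcement],
) -> List[str]:
    """Return alerts for unexpected origins and for prefixes with no route."""
    alerts: List[str] = []

    # Flag announcements whose origin AS differs from the expected one.
    for ann in observed:
        expected = expected_origins.get(ann.prefix)
        if expected is None or not ann.as_path:
            continue  # not one of our prefixes, or nothing to compare
        origin = ann.as_path[-1]
        if origin != expected:
            alerts.append(
                f"{ann.prefix}: origin AS{origin}, expected AS{expected}"
            )

    # Flag expected prefixes that were not observed at all.
    seen = {ann.prefix for ann in observed}
    for prefix in sorted(expected_origins):
        if prefix not in seen:
            alerts.append(f"{prefix}: no route observed (possible withdrawal)")

    return alerts
```

In a real deployment the `observed` list would come from a live BGP feed seen from many vantage points; where that data comes from is outside the scope of this sketch.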
To pay attention to all those sets of technologies, we need a different set of tools than what companies have been using traditionally. Yeah, of course many tools have a little bit of context on DNS. Some of these protocols you might be able to monitor with other tools. Some of the network management tools could be useful, but nobody wants to be left with an incomplete view. Again, Facebook was not aware of what was going on.
I was with one of the leading financial services companies and they said, "Our website was down for an hour and 25 minutes. For over an hour, we did not know it was a DNS issue." The other day, just last week, we saw one of the leading hotel chains in the world - their SSL certificate expired. So, for a couple hours you could not do anything [on their platform]: reserve hotels or check your reservation or do any of that.
These are the kind of simple problems that companies are failing on. They don't have the right tooling to make sure they have the visibility required to know what's going on, and when something does happen, getting those early signals so they can fix the problem before it becomes a major incident.
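An expired certificate is one of the easier problems to get an early signal on. The sketch below, using only the Python standard library, reads the `notAfter` date from a server's certificate and classifies it as ok, warning, or expired. The function names and the 14-day warning threshold are assumptions chosen for illustration, not a recommendation from the speakers.

```python
import socket
import ssl
from datetime import datetime, timezone
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[datetime] = None) -> float:
    """Days between `now` and a certificate's notAfter string.

    `not_after` uses the format returned by getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    now = now or datetime.now(timezone.utc)
    expiry_epoch = ssl.cert_time_to_seconds(not_after)
    return (expiry_epoch - now.timestamp()) / 86400


def classify(days_left: float, warn_days: int = 14) -> str:
    """Map remaining days to a status; warn_days is an assumed threshold."""
    if days_left <= 0:
        return "expired"
    if days_left < warn_days:
        return "warning"
    return "ok"


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_host(host: str) -> str:
    """Return 'ok', 'warning', or 'expired-or-invalid' for a host.

    With the default context, an already expired (or otherwise invalid)
    certificate fails the handshake, so that case is reported here.
    Network and DNS errors are left to propagate to the caller.
    """
    try:
        not_after = fetch_not_after(host)
    except ssl.SSLCertVerificationError:
        return "expired-or-invalid"
    return classify(days_until_expiry(not_after))
```

Running `check_host` on a schedule, ideally from more than one location, would surface a "warning" status days before customers see a browser error.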
I was looking at Southwest Airlines; I think they halted operations yesterday again after the problem they had on Thanksgiving, and it highlights the importance of catching things in time. If Southwest had to stop operations for 5 minutes, nobody would know. A 5-minute delay to your flight, I would say it's a good flight, right? But when it goes from 5 minutes to a couple hours, it starts creating a massive disruption.
Let me ask you, how is this different from what we used to see with a lot of companies that were running like NOC to monitor their network and stuff like that? Were they able to look into all these layers or is this something that has grown as the Internet has kind of grown out?
Well, I think it's a different concept to NOC, right, because if you look at the majority of the network management tools out there, NPMD as Gartner likes to call them, they start by monitoring devices through SNMP, right? They look at traffic flows, they look at your local network. Anything that’s outside is outside their core competency.
At the same time, a lot of these companies have application management tools. When we talk about the stack, the first thing that might come to mind is not only the seven layers, all the OSI layers, networking, but also the application stack, right? And there are really specific tools for that. This is just one way of thinking about the layers in a stack; you can also think about the traditional 3-tier architecture or many other ways, especially now that they have, you know, microservices and Kubernetes and all those technologies.
You have NPM tools to look at traditional networks, you have APM tools to look at traditional application stacks, which are still essential. But then you need a different set of tools to look at this Internet Stack, right? And that's what we call Internet Performance Monitoring (IPM). The reason why those things are important is I don't know any NPM tool that even understands BGP. Yes, you can get some raised data from the public networks, but that's not real time data. You don't want to be 15 minutes behind reality or diagnosing, and then 15 minutes behind to see if your fixes work. But it requires also a specific infrastructure inside the Internet infrastructure, specific nodes.
At Catchpoint, for example, we have nodes inside the BGP peers, inside the ASNs, inside the wireless providers, inside the cloud networks. In fact, the cloud providers and the CDNs, all the large ones, use Catchpoint to monitor their network. So, when you think about Cloudflare, Fastly, AWS, Google and Azure, they all use Catchpoint to make sure their services are up and that they get these early warning signs.
Our nodes are inside all of that infrastructure globally, across almost every country you can think of. That gives you visibility that you're simply not going to get with other tools. In fact, a lot of the tools that companies use nowadays even for APM are hosted on the cloud. If your tools are hosted on Amazon AWS, and AWS goes down, you cannot use your tools. You cannot get alerts. You cannot use any technology. So, you're dependent on those.
We made a deliberate decision to have an independent system, so you can still be up and still be in control when those services go down.
Excellent. All right. We're almost coming up to time, but I wanted to know if you had any kind of summary thoughts or final take away for the folks attending here today.
So, the idea first is APM plus IPM, you could also say plus NPM. This is a new technology in the ascendancy that every company and every application needs nowadays. Companies need to balance the investment in time, skills and people between traditional network monitoring, application monitoring and Internet monitoring. This is what we call Internet Performance Monitoring.
The goal precisely is to catch any issues in that Internet stack that could impact your business, such as your application, your customer, your employee, your network or even your website experience. That's what we do at Catchpoint. We're not the only company, but this is our sole focus, and we spend every hour in the day thinking about IPM and how we can help organizations catch issues in their Internet stack.
All right Gerardo Dada, thank you so much for your time today. Really appreciate it.
Thank you, David. Have a good day. And thanks for all of yours.
Thank you. I just want to let our attendees know that they can tune in for episode 3 on May the 4th, where we'll be expanding a little bit more on the conversation about how Internet Resilience can help eCommerce companies drive more revenue. So that'll be May 4th. 1:00 PM Eastern Time for episode 3 of this ITOps Live Microwebinar series. So, for all our attendees, thanks for showing up. Thanks for giving us 10 minutes of your time and until next time I'm Dave Rubinstein, editor in chief of ITOps Times. So long for now.