With Catchpoint's new custom monitor, we can now capture packet loss, jitter, and RTT to measure the quality of an audio session over SIP.
The pandemic has changed the way teams collaborate, both within an organization and between companies. With work from home becoming the new normal, employees are turning to new options for collaboration, and meetings, training, and onboarding have moved online. The office is now a virtual space. With the increasing demand for online meetings, it is even more important to monitor their health and performance.
Voice over IP (VoIP) technology delivers voice and multimedia sessions over the internet. There have been significant advances in this field that have improved audio quality over the internet. Even with these advances, packet loss remains a major cause of performance degradation. With Catchpoint’s new custom monitor, we can now capture packet loss, jitter, and Round Trip Time (RTT) to measure the quality of an audio session over SIP. These metrics are then used to calculate the Mean Opinion Score (MOS) for a VoIP SIP call.
In this blog, we discuss the technology, protocols and metrics that the Catchpoint custom monitor uses to measure audio quality.
Many organizations are switching from traditional phone systems to VoIP, which has several advantages over them. First, call rates are lower and do not increase for long-distance calls. Second, it allows us to integrate other media types, such as documents, images, and video, along with the audio.
The media transported over the internet is encoded with one of several available audio and video codecs. An audio codec converts the speech signal into a digitally encoded signal. A video codec breaks the video into small chunks, which are then compressed using an algorithm. Each codec has its own advantages and can be selected based on the use case: some implementations rely on narrowband, compressed speech, while others support high-fidelity stereo codecs. In recent years, new hardware and algorithms have been designed to handle packet loss and improve voice and video quality. VoIP data is transferred over the Real-time Transport Protocol (RTP).
RTP was designed to deliver real-time media over IP networks, and it typically runs on top of UDP. RTP packets are used whenever media is transferred over the internet. The advantage RTP packets have over plain UDP datagrams is that each one carries a sequence number and a timestamp. The sequence number lets the receiver put the packets in order, and the timestamp records when each packet's media was generated. Together they help rearrange packets that arrive out of order at the destination and identify any missing packets.
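To make the role of the sequence number concrete, here is a minimal sketch in Python. It decodes the 12-byte fixed RTP header defined in RFC 3550, then sorts the received packets and lists the sequence numbers that never arrived. The names `RtpHeader`, `parse_rtp_header`, `reorder_and_find_gaps`, and `make_packet` are illustrative assumptions, not part of the Catchpoint monitor, and the sketch ignores 16-bit sequence wrap-around.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: flags, marker/payload type, sequence, timestamp, SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(b0 >> 6, b1 & 0x7F, seq, ts, ssrc)


def reorder_and_find_gaps(headers):
    """Sort headers by sequence number and list sequence numbers never seen.

    Simplification (assumption): sequence wrap-around at 65535 is ignored.
    """
    if not headers:
        return [], []
    ordered = sorted(headers, key=lambda h: h.sequence)
    seen = {h.sequence for h in ordered}
    first, last = ordered[0].sequence, ordered[-1].sequence
    missing = [s for s in range(first, last + 1) if s not in seen]
    return ordered, missing


def make_packet(seq: int, ts: int) -> bytes:
    """Build a synthetic RTP packet (version 2, payload type 0) for the demo."""
    return struct.pack("!BBHII", 0x80, 0, seq, ts, 0x1234) + b"\x00" * 160


# Packets 1, 3, 2, 5 arrive out of order; packet 4 never arrives.
arrived = [parse_rtp_header(make_packet(s, s * 160)) for s in (1, 3, 2, 5)]
ordered, missing = reorder_and_find_gaps(arrived)
print([h.sequence for h in ordered], missing)  # prints [1, 2, 3, 5] [4]
```

A production receiver would also handle sequence wrap-around and use a jitter buffer rather than sorting a complete list after the fact.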
Even with all these mechanisms in place, RTP data transmission suffers from packet loss and jitter. The Catchpoint custom monitor captures performance metrics during RTP transmission. These metrics can then be used to alert on performance degradation. The metrics captured are packet loss, jitter, Round Trip Time (RTT), and the Mean Opinion Score (MOS).
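As an illustration of how jitter and packet loss can be derived from RTP traffic, the sketch below applies the running interarrival jitter estimator from RFC 3550 (section 6.4.1) to per-packet send and receive times, and computes loss as the share of expected packets that never arrived. This is not the monitor's own code: the function names are hypothetical, and working in milliseconds instead of RTP timestamp units is a simplifying assumption.

```python
def interarrival_jitter(send_ms, recv_ms):
    """Running jitter estimate in the style of RFC 3550, section 6.4.1.

    For each consecutive pair of packets, D is the change in transit time;
    the estimate moves 1/16 of the way toward |D| on every packet.
    """
    jitter = 0.0
    for i in range(1, len(send_ms)):
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


def packet_loss_percent(expected: int, received: int) -> float:
    """Share of expected packets that never arrived, as a percentage."""
    if expected <= 0:
        return 0.0
    return 100.0 * (expected - received) / expected


# Packets sent every 20 ms; a constant 50 ms network delay produces no jitter.
print(interarrival_jitter([0, 20, 40], [50, 70, 90]))  # prints 0.0
# 97 of 100 expected packets arrived.
print(packet_loss_percent(100, 97))  # prints 3.0
```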
MOS rates the quality of an audio session on a scale that tops out at 5, although in practice the maximum score of 5 is never awarded. The MOS is calculated from three of these metrics: Round Trip Time, jitter, and packet loss. The formula for calculation is from pigman.com. The image below illustrates the formula and calculation.
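For readers who want to experiment, here is a hedged sketch of one commonly circulated approximation. It first derives an R-factor from latency, jitter, and packet loss using heuristic constants, then maps R to MOS with the conversion from the ITU-T G.107 E-model. It is an assumption that this matches the formula referenced above; the function names and constants in `r_factor` are illustrative and should be checked against the original source before relying on them.

```python
def r_factor(latency_ms: float, jitter_ms: float, loss_percent: float) -> float:
    """Heuristic R-factor (assumed approximation, not the monitor's code)."""
    # Jitter is weighted double, plus a fixed 10 ms allowance for codec delay.
    effective_latency = latency_ms + 2 * jitter_ms + 10
    if effective_latency < 160:
        r = 93.2 - effective_latency / 40
    else:
        r = 93.2 - (effective_latency - 120) / 10
    # Each percent of packet loss costs 2.5 R points.
    return r - 2.5 * loss_percent


def mos_from_r(r: float) -> float:
    """Map an R-factor to MOS using the ITU-T G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 0.000007 * r * (r - 60) * (100 - r)


good = mos_from_r(r_factor(latency_ms=40, jitter_ms=5, loss_percent=0.5))
bad = mos_from_r(r_factor(latency_ms=300, jitter_ms=40, loss_percent=5))
print(round(good, 2), round(bad, 2))
```

Under this approximation the R-factor starts at 93.2 before any impairment is subtracted, which corresponds to a MOS of about 4.4. That is consistent with the observation that a score of 5 is never reached.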
The data sent over the IP network is controlled by signaling protocols. A signaling protocol establishes, maintains, and terminates a call. There are multiple signaling protocols to choose from; in this case, we rely on the Session Initiation Protocol (SIP).
SIP is an application-layer control protocol that can establish, modify, and terminate multimedia sessions (conferences) such as Internet telephony calls. SIP can also invite participants to already existing sessions, such as multicast conferences. Media can be added to (and removed from) an existing session. Read more about SIP in RFC 3261, "SIP: Session Initiation Protocol".
To capture the complete picture of SIP performance, we have added four metrics. These metrics capture how much time was spent at different stages of a SIP session and help answer important questions, such as how long it took to connect or disconnect.
The four metrics captured for SIP are listed below; refer to Fig 3 to understand each.
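As a general illustration of stage timing, the sketch below computes durations from the timestamps of the main messages in a basic SIP call flow (INVITE, 180 Ringing, 200 OK, BYE, and the 200 OK that answers the BYE). The class, field, and key names are hypothetical and are not the monitor's actual metric names; timestamps are assumed to be in milliseconds.

```python
from dataclasses import dataclass


@dataclass
class SipCallTimeline:
    """Timestamps (ms) of key messages in one call; all names are hypothetical."""
    invite_sent: int
    ringing_received: int  # 180 Ringing
    ok_received: int       # 200 OK answering the INVITE
    bye_sent: int
    bye_ok_received: int   # 200 OK answering the BYE


def stage_durations(t: SipCallTimeline) -> dict:
    """Time spent in each stage of the session, in milliseconds."""
    return {
        "time_to_ring": t.ringing_received - t.invite_sent,
        "time_to_connect": t.ok_received - t.invite_sent,
        "call_duration": t.bye_sent - t.ok_received,
        "time_to_disconnect": t.bye_ok_received - t.bye_sent,
    }


timeline = SipCallTimeline(
    invite_sent=0,
    ringing_received=120,
    ok_received=900,
    bye_sent=30900,
    bye_ok_received=30980,
)
durations = stage_durations(timeline)
print(durations)
```

In a real client, these timestamps would be recorded in the callbacks or notifications the SIP library emits as the session changes state.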
The custom monitor relies on the SIP Simple Client SDK. Sample performance metrics for an audio session are charted in the dashboard below (Fig 4).
With a total of eight metrics covering SIP and RTP performance, the custom monitor provides a comprehensive picture of a voice call. It helps us understand and analyze any performance degradation. The metrics can also be used to trigger alerts when certain thresholds are crossed.
To help you get started with custom monitors, detailed instructions and codebase are available on GitHub.