Listen to call recordings to hear the difference between calls from other providers, CarrierX’s direct connections, and high-definition AMR-WB calls.
The de facto standard audio codec used in telephony, G.711 μ-law, was adopted more than 50 years ago and is still in use. Send a call to the least cost route, or perhaps multiple downstream LCRs, and your call gets encoded, decoded, re-encoded, and so on. The audio quality of calls can be noticeably poor. With widespread use of mobile phones and internet communication, users’ expectations have moved beyond the tinny sound of yesterday’s telephone calls.
HD Direct eliminates poor quality audio with direct connections to on-net carriers and high-definition audio.
In telephone calls, codec quality directly affects the clarity and intelligibility of spoken words. Higher-quality codecs result in clearer and more natural-sounding speech, contributing to more effective communication. Higher-quality codecs preserve more details and nuances in audio, providing a clearer and more immersive user experience. In contrast, lower-quality codecs may introduce compression artifacts, reduced dynamic range, or other distortions, causing speech to sound muffled or distorted, leading to a less satisfying experience. Each time a call is encoded, due to multiple TDM hops, the quality and user experience suffer more.
Overall, user satisfaction with phone calls is closely tied to codec quality. High-quality audio contributes to a positive user experience, while lower-quality audio may result in frustration, dissatisfaction, or a perception of subpar performance.
Audio quality significantly influences the performance of AI, especially in tasks related to audio processing, speech recognition, and natural language understanding.
Real-World Applications
Language Understanding
Feature Extraction
User Interaction
Accuracy of Speech Recognition
In practical applications like automatic speech recognition (ASR) systems, high-quality audio is essential for reliable and accurate performance. The success of these applications relies on the model's ability to understand and respond appropriately to user input.
Mean Opinion Score (MOS) is a measure of the quality of sound (and potentially other stimuli), and is generally a subjective measure of performance quality. MOS is measured on a scale of 1 – 5 with 1 being the worst and 5 being the best. A person listens to a sound and assigns a value based on how they experience the quality.
It is possible to get an objective measure of perceived audio quality using tools such as ViSQOL from Google. ViSQOL uses machine learning to compare a degraded audio sample with an original reference recording, producing a MOS-LQO (Mean Opinion Score - Listening Quality Objective) score.
To evaluate the performance of HD Direct, we recorded calls from:
We played multiple high-quality recordings into the calls, recorded the results and used ViSQOL to compare the recorded calls with the original high-quality recordings.
Calls over the PSTN produced MOS scores between 2.1 and 2.7.
Calls over CarrierX direct connections using standard definition yielded scores between 2.7 and 3.3. Comparing calls over the CarrierX direct connections using standard definition to calls over the PSTN, CarrierX Direct performed between 10% and 36% better. On average, calls over CarrierX direct connections performed 21% better than calls over the PSTN.
HD Direct calls using AMR-WB scored MOS scores between 3.9 and 4.3. Calls via HD Direct improved upon PSTN calls by between 50% and 99% with an average score 67% better.
Human speech typically ranges from 100Hz to 17,500Hz. While μ-law can make speech understandable, the quality is only passable with a frequency range of 300Hz - 3,300Hz. This means that, in addition to potentially introducing compression artifacts, a common phone call loses much of the subtlety and rich tonality of the speaker.
AMR-WB has a frequency range of 50Hz - 7,000Hz, greatly enhancing the quality of the sound of a call, enhancing caller satisfaction.