Do You Hear What I Hear? Part IV: Measuring “Toll Quality”

Continuing our series on VoIP Quality of Service (QoS) issues, recall that the last few installments defined QoS, considered some of the transmission impairments that affect it, and examined the sources of latency, or delay. Latency is a major deterrent to voice and video quality because it affects how end users interact with their associates at the other end of the connection.

Now take a mental break and roll back your time capsule to the last trade show or conference you attended where vendors were exhibiting their VoIP solutions. How many of these vendors were touting that their products produced toll quality voice? If your answer is close to 100 percent, I would not be surprised, as the objective of VoIP technology is to be at least as good as the Public Switched Telephone Network (PSTN). Not to mention all of the advantages of the integrated, converged, packet-switched network (supporting voice, data, and video all in one fell swoop, with five nines of reliability) that those vendors would like you to purchase!

But how do you know if the vendors are telling you the truth? Is their system really as good as toll quality voice? For that matter, what do we mean by toll quality?

Fortunately, the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) has addressed these questions in two key recommendations that are available at its website. (As an aside, the ITU-T presently allows individuals to download three of its recommendations without charge upon registering at the ITU-T Electronic Bookshop.) The first recommendation is titled Methods for Subjective Determination of Transmission Quality, designated P.800, and the second is titled Perceptual Evaluation of Speech Quality (PESQ), an Objective Method for End-to-End Speech Quality Assessment of Narrow-Band Telephone Networks and Speech Codecs, designated P.862. (While you are at the ITU-T website, note some of the other standards, such as P.910, P.911, P.920, and P.930, that address audiovisual quality, but that’s a story for another time.)

For many years, the telephone industry has employed a very subjective rating system, known as the Mean Opinion Score, or MOS, to measure the quality of telephone connections. These measurement techniques are based on the opinions of many testing volunteers who listen to a sample of voice traffic and rate the quality of that transmission. In doing so, they consider a number of factors that could degrade the quality of transmission, including loss, circuit noise, talker echo, distortion, propagation time (or delay), and other transmission problems.

The best-known test, described in Annex A of P.800, is called the Conversation Opinion Test. The volunteer subjects are asked to provide their opinion of the connection they have just been using, based on a five-point scale:

  Quality   Rating
  Excellent   5
  Good   4
  Fair   3
  Poor   2
  Bad   1

Since the test subjects are human, some variation in the scores is expected. For that reason, a large number of people are used in the test, and their individual scores are averaged (hence the term Mean in Mean Opinion Score). A MOS of 4 is considered toll quality within the telephone industry. So if a vendor states that their system achieves toll quality, ask them for their MOS, and look for a number of at least 4.0.
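To make the averaging concrete, here is a minimal Python sketch of how a MOS could be computed from individual opinions and compared against the 4.0 toll quality figure. The function names and the sample ratings are hypothetical, invented purely for illustration; they are not part of P.800, and a real test involves far more listeners and a controlled procedure.

```python
from statistics import mean

# Five-point opinion scale used in the Conversation Opinion Test.
SCALE = {"Excellent": 5, "Good": 4, "Fair": 3, "Poor": 2, "Bad": 1}

# Threshold the article cites as toll quality.
TOLL_QUALITY_MOS = 4.0


def mean_opinion_score(ratings):
    """Average a list of opinion labels (e.g. "Good") into a MOS."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return mean(SCALE[label] for label in ratings)


def is_toll_quality(mos, threshold=TOLL_QUALITY_MOS):
    """Return True when the MOS meets or exceeds the threshold."""
    return mos >= threshold


# Hypothetical opinions from ten listeners (illustrative, not measured data).
ratings = ["Excellent", "Good", "Good", "Good", "Fair",
           "Excellent", "Good", "Good", "Excellent", "Good"]

mos = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f}, toll quality: {is_toll_quality(mos)}")
# prints "MOS = 4.20, toll quality: True"
```

Note that a single "Fair" among mostly "Good" and "Excellent" opinions still leaves the average above 4.0, which is why the size and makeup of the listener panel matter when comparing vendor claims.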

The companion standard, P.862, defines another quality measurement called the Perceptual Evaluation of Speech Quality, or PESQ. PESQ also addresses the effects of filters, variable delay, and coding distortions, and is thus applicable to both speech codec evaluation and end-to-end measurements. The PESQ algorithm is fairly complex, but it nevertheless produces a summary score between -0.5 and 4.5, with typical results in the range from 1.0 to 4.5.

In summary, the P.800 MOS test is a subjective evaluation, while the tests defined in P.862 are objective measurements that can be implemented in hardware for greater accuracy. In many cases, both testing methods are used to give a broader perspective on the overall quality of the voice system.

So now you know about toll quality, and what the various MOS results represent. In our next tutorial, we will continue the discussion of QoS issues, and begin to look at some of the technical solutions that are designed to achieve higher QoS results.

Copyright Acknowledgement: © 2005 DigiNet ® Corporation, All Rights Reserved

Author’s Biography
Mark A. Miller, P.E. is President of DigiNet ® Corporation, a Denver-based consulting engineering firm. He is the author of many books on networking technologies, including Voice over IP Technologies, and Internet Technologies Handbook, both published by John Wiley & Sons.
