With Skype for Business and Microsoft Teams, we know that having a “good” network is important for user experience, but what defines good?
At Modality Systems we have a Diagnostics product that reports on Skype for Business network performance and we primarily look at average packet loss, jitter and round-trip time.
- **Packet loss**: Often defined as the percentage of packets lost in a given window of time. Packet loss directly affects audio quality, from small, individual lost packets having almost no impact, to back-to-back burst losses that cause complete audio cut-out.
- **Jitter** (inter-packet arrival jitter): The average change in delay between successive packets. Most modern VoIP software, including Skype for Business, can adapt to some level of jitter through buffering; it's only when the jitter exceeds the buffer that a participant notices the effects.
- **Latency**: The time it takes to get an IP packet from point A to point B on the network. This propagation delay is essentially tied to the physical distance between the two points and the speed of light, plus the overhead added by the routers in between. Latency is measured one-way or as round-trip time (RTT).
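To make the first two metrics concrete, here is a minimal sketch of how they can be derived from packet sequence numbers and transit times. The function names and sample data are hypothetical; real media stacks (including SfB's) compute these per RFC 3550, with more care around wrap-around and timing.

```python
# Illustrative only: deriving loss and jitter from hypothetical packet
# records (RTP-style sequence numbers, per-packet transit times in ms).

def packet_loss_pct(seqs):
    """Percentage of packets lost in a window, from sequence numbers."""
    expected = max(seqs) - min(seqs) + 1
    return 100.0 * (expected - len(seqs)) / expected

def mean_jitter_ms(transit_ms):
    """Average change in delay between successive packets (ms)."""
    deltas = [abs(b - a) for a, b in zip(transit_ms, transit_ms[1:])]
    return sum(deltas) / len(deltas)

# Example window: 2 of 10 packets lost, transit times wobbling by a few ms.
seqs = [1, 2, 3, 5, 6, 7, 8, 10]
transit = [40.0, 42.0, 41.0, 45.0, 44.0, 43.0, 47.0, 46.0]
print(packet_loss_pct(seqs))    # 20.0 (% lost)
print(mean_jitter_ms(transit))  # 2.0 (ms)
```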
Different codecs cope with imperfect networks to different degrees: modern codecs such as RTAudio and SILK handle network issues better than older codecs like G.711. Bear in mind that in an SfB environment you'll be using different codecs in different scenarios. You can argue over exactly what level of each metric is “good” or “bad” depending on the codec and the tolerance of the user; ultimately there are no definitively correct or incorrect thresholds, as long as the ones you choose are used to find issues and improve network performance. Getting hung up on exactly where “good” ends and “poor” begins is less important than finding and correcting issues.
What does Microsoft Define as Poor?
In QoE (the SfB Server Quality of Experience performance database), a session that exceeds one or more of these limits is marked as “ClassifiedPoorCall”.
So that’s over 500 ms RTT, over 10% average packet loss, or over 30 ms jitter. Reference Jen’s great post here.
Personally, I think these are quite high, but there is little doubt the call is poor if you are hitting 10% packet loss or over 500 ms round trip.
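Based on those limits, the “poor call” classification might be sketched as below. The dictionary keys are illustrative, not the actual QoE column names; the real classifier also considers other metrics.

```python
# Sketch of the "ClassifiedPoorCall" idea: a session is poor if it
# exceeds ANY one of the limits. Metric names here are made up.

POOR_LIMITS = {
    "round_trip_ms": 500,     # RTT over 500 ms
    "packet_loss_pct": 10.0,  # average packet loss over 10%
    "jitter_ms": 30,          # inter-arrival jitter over 30 ms
}

def is_poor_call(session):
    return any(session.get(metric, 0) > limit
               for metric, limit in POOR_LIMITS.items())

print(is_poor_call({"round_trip_ms": 120, "packet_loss_pct": 2.0, "jitter_ms": 12}))  # False
print(is_poor_call({"round_trip_ms": 620, "packet_loss_pct": 2.0, "jitter_ms": 12}))  # True
```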
What is Good?
More recent guidance from Microsoft for SfB Online performance, measured from the client to the Microsoft network edge, recommends the following for optimal Skype for Business media quality:

| Metric | Target |
|---|---|
| Latency (one way) | < 50 ms |
| Latency (RTT, or round-trip time) | < 100 ms |
| Burst packet loss | < 10% during any 200 ms interval |
| Packet loss | < 1% during any 15 s interval |
| Packet inter-arrival jitter | < 30 ms during any 15 s interval |
| Packet reorder | < 0.05% out-of-order packets |

It is interesting that in this guidance the RTT must be below 100 ms for “optimal” performance, packet loss under 1% for any 15 s interval (so effectively under 1% average), and jitter under 30 ms.
The Lync Server Networking Guide from Microsoft (Lync_Server_Networking_Guide_v2.3.docx, a great, detailed document) recommends the following thresholds:
- Packet loss: On any managed wired network link, a packet loss threshold of 1% is a good value to use to find infrastructure issues.
- Jitter: On a managed wired link, you should investigate jitter above 3 ms. Thresholds formed around jitter values to determine whether audio is good or poor can be very misleading, because most modern VoIP software can adapt to high levels of jitter through buffering.
- RTT: Much of the existing documentation about latency thresholds describes the 150 ms threshold that the International Telecommunication Union – Telecommunication Standardization Sector (ITU-T) defines as acceptable for VoIP.
For our reporting purposes, we use thresholds of < 1% packet loss, < 20 ms jitter and < 300 ms RTT as our “good”. The RTT threshold is set at 300 ms because the ITU-T’s 150 ms above is one-way, not round-trip, and SfB reports RTT.
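Putting our “good” thresholds together with Microsoft’s “poor” limits gives a three-way banding, which could be sketched as follows. This is an illustration of the idea, not our actual reporting logic, and the metric names are hypothetical.

```python
# Illustrative banding: "good" per the reporting thresholds above,
# "poor" per Microsoft's QoE limits, anything in between "impacted".

GOOD = {"packet_loss_pct": 1.0, "jitter_ms": 20, "round_trip_ms": 300}
POOR = {"packet_loss_pct": 10.0, "jitter_ms": 30, "round_trip_ms": 500}

def classify(session):
    if any(session[m] > POOR[m] for m in POOR):
        return "poor"        # exceeds any of Microsoft's limits
    if all(session[m] < GOOD[m] for m in GOOD):
        return "good"        # inside every "good" threshold
    return "impacted"        # the grey area in between

print(classify({"packet_loss_pct": 0.2, "jitter_ms": 5, "round_trip_ms": 80}))     # good
print(classify({"packet_loss_pct": 3.0, "jitter_ms": 25, "round_trip_ms": 250}))   # impacted
print(classify({"packet_loss_pct": 12.0, "jitter_ms": 25, "round_trip_ms": 250}))  # poor
```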
So there is a range between “good” and “bad/poor”: not perfect, not ideal, but maybe passable. At Modality, we refer to that as “Impacted”. Some customers with a well-managed corporate network consider any impacted sessions unacceptable. Others, with more variable networks or less network investment, might consider some impacted sessions, while not ideal, a reality.
In Microsoft’s reports, “impacted” is equivalent to the yellow-highlighted metrics. I have never been able to get an exact range from Microsoft on what triggers “yellow” in the SSRS QoE reports or Call Analytics; essentially, anything between “good” and Microsoft’s “poor” is “impacted”.
What about Mean Opinion Score (MOS)?
Mean Opinion Score (MOS) is an industry-standard measure of voice quality. It is a score out of 5, but certain codecs can only reach certain levels, so you can’t treat it as a pure network-performance score where 5 is good and 4 is worse. For example, G.711 can score up to 4.30 and RTAudio Wideband up to 4.10, but Siren only 3.72 and RTAudio Narrowband 2.95. So you can only measure MOS relative to the codec used.
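One way to make that concrete is to normalise a reported MOS against the codec’s ceiling. The ceilings below are the figures quoted above; the normalisation itself is just an illustration, not something SfB reports.

```python
# Normalising MOS against each codec's maximum achievable score,
# using the ceilings quoted in the text. Illustrative only.

MAX_MOS = {
    "G.711": 4.30,
    "RTAudio Wideband": 4.10,
    "Siren": 3.72,
    "RTAudio Narrowband": 2.95,
}

def relative_mos(codec, mos):
    """Fraction of the codec's best possible MOS actually achieved."""
    return mos / MAX_MOS[codec]

# A 2.9 on RTAudio Narrowband is near that codec's ceiling;
# the same 2.9 on G.711 indicates a genuinely degraded call.
print(round(relative_mos("RTAudio Narrowband", 2.9), 2))  # 0.98
print(round(relative_mos("G.711", 2.9), 2))               # 0.67
```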
Microsoft doesn’t really recommend the use of MOS as a primary measure of Quality:
“ Real MOS measurement relies on individuals to provide their opinion of quality regarding audio clips of standardized lengths. Over the years, computer algorithms and databases have been developed to try to estimate MOS programmatically, based on payload analysis or network metrics analysis. These models are generally very accurate if the test audio samples are also of fixed lengths. Typically, these samples are around eight seconds long. Individuals can generally reach a consistent consensus in evaluating audio quality for short audio samples.
However, if the algorithms are used to calculate MOS for entire calls, the metric starts to deviate from real-world opinions of quality. For example, users experiencing audio distortions in calls might also consider the convenience or novelty of being able to place a call from their mobile application and disregard any actual issues. On the other hand, if the distortions interrupted an important conversation, even for a brief moment, individuals might not be so inclined to dismiss them. In large metrics database systems such as QoE, aggregating MOS using statistical functions such as AVERAGE() or MIN() can distort the view even further.” From The Lync Server Networking Guide from Microsoft