Chapter 9: Network Requirements and Preparation Understanding the Requirements for Toll-Quality Voice
ShoreTel 8 Planning and Installation Guide 105
Latency is the amount of time it takes for one person’s voice to be sampled, digitized (or
encoded), packetized, sent over the IP network, de-packetized, and replayed to another
person. This one-way delay, from “mouth-to-ear,” must not exceed 100 msecs for
toll-quality voice, or 150 msecs for acceptable-quality voice. If the latency is too high, it
interferes with the natural flow of the conversation, causing the two parties to mistake the
latency for pauses in speech. The resulting conversation is reminiscent of international
calls over satellite facilities.
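The two thresholds above can be expressed as a small check. This is only an illustrative sketch: the function name and the returned labels are hypothetical, and only the 100-msec and 150-msec limits come from the text.

```python
# One-way ("mouth-to-ear") latency limits quoted in the text, in msecs.
TOLL_QUALITY_MAX_MSECS = 100
ACCEPTABLE_MAX_MSECS = 150


def classify_one_way_latency(latency_msecs: float) -> str:
    """Return a quality label for a one-way latency (labels are illustrative)."""
    if latency_msecs <= TOLL_QUALITY_MAX_MSECS:
        return "toll-quality"
    if latency_msecs <= ACCEPTABLE_MAX_MSECS:
        return "acceptable-quality"
    return "too high"
```

For example, a measured one-way delay of 120 msecs would fall in the acceptable-quality band under this sketch.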
The latency introduced by the ShoreTel 8 system can be understood as follows: When a
person talks, the voice is sampled by the ShoreGear voice switch, generating a latency of 5
msecs. If the call does not travel between ShoreTel voice switches and is handled completely
internally by one switch, the latency is generated by the basic internal pipeline of that
switch. In this case, the switch samples the voice, processes it, combines it with other voice
streams (switchboard), and then converts it back to audio for output to the phone in
5-msec packets, for a total latency of about 17 msecs.
When the call travels between voice switches, the voice is packetized in larger packets
(10-msec for LAN and 20-msec for WAN) to reduce network overhead. The larger packets
take more time to accumulate and convert to RTP before being sent out. On the receive
side, the incoming packets are decoded and placed in the queue for the switchboard. For a
10-msec packet, this additional send/receive time is approximately 15 msecs, and for a
20-msec packet it is about 25 msecs.
For IP phones, the latency is 20 msecs in the LAN and 30 msecs in the WAN.
When the codec is G.729a, the encoding process takes an additional 10 msecs and the
decoding process can take an additional 10 msecs.
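The switch-to-switch figures above can be combined into a rough estimate. The sketch below assumes the quoted values are additive (17 msecs for the internal pipeline, plus 15 or 25 msecs of send/receive time, plus 10 msecs each for G.729a encoding and decoding). That additive reading, the function name, and its parameters are assumptions for illustration; network transit time and jitter buffering are not included.

```python
# Approximate figures quoted in the text, in msecs.
SWITCH_INTERNAL_MSECS = 17                    # call handled inside one switch
INTER_SWITCH_EXTRA_MSECS = {10: 15, 20: 25}   # packet size -> added send/receive time
G729A_ENCODE_MSECS = 10
G729A_DECODE_MSECS = 10


def estimate_switch_latency(packet_msecs=None, g729a=False):
    """Estimate system latency in msecs (assumes the quoted figures simply add up).

    packet_msecs: None for a call inside one switch, 10 for LAN, 20 for WAN.
    g729a: True when the G.729a codec is in use.
    """
    total = SWITCH_INTERNAL_MSECS
    if packet_msecs is not None:
        total += INTER_SWITCH_EXTRA_MSECS[packet_msecs]
    if g729a:
        total += G729A_ENCODE_MSECS + G729A_DECODE_MSECS
    return total
```

Under these assumptions, a WAN call between switches using G.729a comes to about 17 + 25 + 20 = 62 msecs before any network delay, which leaves the remainder of the 100-msec budget for the network and the jitter buffer.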
See Table 9-6 for specific information about latency on the ShoreTel 8 system.
Table 9-6 Latency
Configuration   Overhead   Encoding   Frame Size -5   Jitter Buffer (a)   Decoding   Total (+/– 5 msec) (b)
Switch          17         0          0               Varies              0          17

Table 9-5 WAN Bandwidth—Bytes
Broadband   Linear     G.711     ADPCM     G.729a    G.729a
256 Kbps    128 Kbps   64 Kbps   32 Kbps   8 Kbps    8 Kbps
284 Kbps    146 Kbps   82 Kbps   52 Kbps   26 Kbps   26 Kbps
260 Kbps    132 Kbps   68 Kbps   37 Kbps   12 Kbps   12 Kbps
a. When ADPCM voice encoding is used, an additional 4 bytes are added to the voice data for decoding purposes.
b. Voice data bytes per packet = (# bits/sample) x (8 samples/msec) x (20 msecs/packet) / (8 bits/byte)
c. Bandwidth = (# bytes/20 msecs) x (8 bits/byte)
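Footnotes b and c of Table 9-5 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the bits-per-sample values used in the example (16 for Linear, 8 for G.711, 4 for ADPCM) are assumptions that happen to reproduce the 128, 64, and 32 Kbps entries. It covers voice data only and ignores packet headers and the 4 extra ADPCM bytes from footnote a.

```python
SAMPLES_PER_MSEC = 8   # from footnote b
PACKET_MSECS = 20      # from footnote b
BITS_PER_BYTE = 8


def voice_bytes_per_packet(bits_per_sample):
    """Footnote b: voice data bytes carried in one 20-msec packet."""
    return bits_per_sample * SAMPLES_PER_MSEC * PACKET_MSECS / BITS_PER_BYTE


def bandwidth_kbps(bytes_per_packet):
    """Footnote c: bits sent per 20 msecs, expressed in Kbps (1 bit/msec = 1 Kbps)."""
    return bytes_per_packet * BITS_PER_BYTE / PACKET_MSECS


# Assumed sample widths, chosen for illustration.
for codec, bits in (("Linear", 16), ("G.711", 8), ("ADPCM", 4)):
    payload = voice_bytes_per_packet(bits)
    print(codec, payload, "bytes/packet,", bandwidth_kbps(payload), "Kbps")
```

With G.711 at 8 bits per sample, this gives 160 bytes per packet and 64 Kbps, matching the voice-data figure in the table.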