Description
Digital Speech Interpolation (DSI) operates on the principle that a typical two-way telephone conversation contains significant periods of silence, such as pauses between sentences and listening time, so each direction of a call carries speech for only part of the call's duration. The system uses voice activity detection (VAD) to distinguish active speech from silence or background noise. During active speech, the digital speech samples are packetized and transmitted over the channel. During silent periods, no channel is allocated to that conversation, which frees bandwidth for other users.
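The speech/silence decision described above can be illustrated with a minimal energy-threshold detector. This is only a sketch under stated assumptions: the function names, the -40 dB threshold, and the hangover length are hypothetical and are not taken from any 3GPP specification; deployed VADs are considerably more elaborate.

```python
import math

def frame_energy_db(frame):
    """Mean energy of one frame of samples (floats in [-1, 1]), in dB re full scale."""
    mean_square = sum(s * s for s in frame) / len(frame)
    # The small constant avoids log10(0) on an all-zero frame.
    return 10.0 * math.log10(mean_square + 1e-12)

def detect_speech(frames, threshold_db=-40.0, hangover_frames=3):
    """Return one boolean per frame: True while the frame is treated as speech.

    threshold_db and hangover_frames are illustrative assumptions. The
    hangover keeps the decision at True for a few frames after the energy
    drops, so that quiet word endings are not cut off.
    """
    decisions = []
    hangover = 0
    for frame in frames:
        if frame_energy_db(frame) > threshold_db:
            hangover = hangover_frames   # re-arm the hangover on every loud frame
            decisions.append(True)
        elif hangover > 0:
            hangover -= 1                # still inside the hangover window
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions

if __name__ == "__main__":
    loud = [0.5, -0.5] * 80        # about -6 dB
    quiet = [0.001, -0.001] * 80   # about -60 dB
    frames = [quiet, loud, loud, quiet, quiet, quiet, quiet, quiet]
    print(detect_speech(frames, hangover_frames=2))
    # [False, True, True, True, True, False, False, False]
```

The hangover trades a little bandwidth for fewer clipped word endings, which is why the two quiet frames after the loud ones are still marked as active.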
The core architecture comprises a speech detector, a buffer, and a control unit. The speech detector analyzes the incoming signal to identify talk spurts, and the buffer holds each spurt briefly while a channel is found. The spurts are then placed into packets with appropriate addressing information. The control unit's assignment logic dynamically assigns available channel slots to active speakers from a pool of users. Because only a fraction of users speak at any instant, this statistical multiplexing allows the number of supported users to exceed the number of physical transmission channels.
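The assignment logic can be sketched as a small pool manager. The class and method names below are hypothetical, and the policy (first free slot, no queueing, a talk spurt that finds no free slot is simply refused) is an assumption chosen for brevity rather than a description of any standardized behaviour.

```python
class ChannelAssigner:
    """Toy assignment logic: lends channel slots to users only while they talk."""

    def __init__(self, num_channels):
        self.free = list(range(num_channels))  # pool of idle channel slots
        self.assigned = {}                     # user id -> channel slot in use
        self.refused = 0                       # talk spurts that found no free slot

    def talkspurt_start(self, user):
        """Assign a slot when a user starts talking; return None if none is free."""
        if user in self.assigned:
            return self.assigned[user]
        if not self.free:
            self.refused += 1
            return None
        channel = self.free.pop(0)
        self.assigned[user] = channel
        return channel

    def talkspurt_end(self, user):
        """Return the user's slot to the pool when the talk spurt ends."""
        channel = self.assigned.pop(user, None)
        if channel is not None:
            self.free.append(channel)

if __name__ == "__main__":
    assigner = ChannelAssigner(num_channels=2)   # three users share two channels
    print(assigner.talkspurt_start("A"))         # 0
    print(assigner.talkspurt_start("B"))         # 1
    print(assigner.talkspurt_start("C"))         # None: both slots are busy
    assigner.talkspurt_end("A")
    print(assigner.talkspurt_start("C"))         # 0: reuses the slot A released
```

Three users are served by two channels here because their talk spurts rarely all overlap; the `refused` counter records the cases where they do.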
In the 3GPP context, DSI is referenced in specifications such as 21.905 (Vocabulary) and 43.050 (Transmission planning aspects of the speech service in the GERAN). Its role is primarily to optimize the use of transmission resources in the core network and on backhaul links, particularly for circuit-switched voice services. It improves trunking efficiency between network nodes, reducing the number of physical E1/T1 lines required and their associated cost.
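The trunking gain can be made concrete with a rough calculation. The sketch below assumes that users talk independently of one another with a fixed activity factor, and it uses 0.4 and a 30-channel link purely as illustrative figures; neither the model nor the numbers come from the specifications cited above.

```python
from math import comb

def overload_probability(users, channels, activity):
    """Probability that more than `channels` of `users` are talking at once.

    Assumes each user talks independently with probability `activity`
    (a simple binomial model, used here only for illustration).
    """
    return sum(
        comb(users, k) * activity**k * (1 - activity) ** (users - k)
        for k in range(channels + 1, users + 1)
    )

if __name__ == "__main__":
    channels = 30    # assumed link size
    activity = 0.4   # assumed speech activity factor
    for users in (30, 45, 60, 75):
        p = overload_probability(users, channels, activity)
        print(f"{users} users on {channels} channels: P(overload) = {p:.6f}")
```

With as many channels as users the overload probability is zero; as the user count grows past the channel count it rises, and the planner picks the largest user count whose overload probability is still acceptable.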
Purpose & Motivation
DSI was created to address the high cost and limited availability of transmission lines, especially in long-distance and international telephony. Before techniques like DSI, each voice call required a dedicated 64 kbps channel (DS0) for its entire duration, regardless of actual speech activity. This was highly inefficient, as a large portion of the bandwidth was wasted during silent periods.
The historical context lies in the transition from analog to digital telephony, where maximizing the return on investment for expensive digital transmission infrastructure became paramount. DSI solved this by applying statistical multiplexing to voice traffic, allowing carriers to serve more subscribers with the same physical resources. It directly addressed the limitation of fixed channel allocation, enabling significant cost savings and increased capacity without degrading perceived voice quality for the end user.
Key Features
- Voice Activity Detection (VAD) to distinguish speech from silence
- Dynamic bandwidth allocation based on speech activity
- Statistical multiplexing of multiple voice channels
- Packetization of active speech spurts
- Increased trunking efficiency for transmission links
- Operation transparent to end users, with no perceived quality loss
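As a rough illustration of how the VAD and packetization features fit together, the sketch below groups consecutive active frames into one packet per talk spurt. The `SpurtPacket` layout, including its `user_id` addressing field, is hypothetical and does not reflect any standardized packet format.

```python
from dataclasses import dataclass, field

@dataclass
class SpurtPacket:
    user_id: int        # hypothetical addressing field
    start_frame: int    # index of the first frame in the talk spurt
    frames: list = field(default_factory=list)  # speech frames carried

def packetize(user_id, frames, active):
    """Build one packet per run of consecutive active frames.

    `active` holds one boolean per frame, for example the output of a VAD.
    Frames marked inactive are dropped and never transmitted.
    """
    packets = []
    current = None
    for index, (frame, is_active) in enumerate(zip(frames, active)):
        if is_active:
            if current is None:
                current = SpurtPacket(user_id, index)  # a new talk spurt begins
            current.frames.append(frame)
        elif current is not None:
            packets.append(current)                    # the talk spurt has ended
            current = None
    if current is not None:
        packets.append(current)                        # spurt still open at end of input
    return packets

if __name__ == "__main__":
    frames = ["f0", "f1", "f2", "f3", "f4", "f5"]
    active = [False, True, True, False, True, False]
    for packet in packetize(7, frames, active):
        print(packet.user_id, packet.start_frame, packet.frames)
    # 7 1 ['f1', 'f2']
    # 7 4 ['f4']
```

Only three of the six frames are carried in this example, which is the bandwidth saving the feature list refers to.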
Evolution Across Releases
DSI was introduced as a defined term and concept within the 3GPP standards for GSM/UMTS networks. It was specified as a transmission planning technique in GERAN to optimize the utilization of the A-interface and other core network transmission links. The initial architecture focused on the efficiency of circuit-switched voice traffic.
Defining Specifications
| Specification | Title |
|---|---|
| TS 21.905 | Vocabulary |
| TS 43.050 | Transmission planning aspects of the speech service in the GERAN |