Description
Source Controlled Rate (SCR) operation is a fundamental feature of speech codecs used in mobile communication systems, most notably the Adaptive Multi-Rate (AMR) and AMR-Wideband (AMR-WB) codecs. In this mode of operation, the encoding rate is controlled dynamically by the presence or absence of active speech from the source (the speaker). During active speech segments, the codec operates at its full, designated bit rate to maintain high voice quality. During periods of silence or background noise, which typically make up more than half of each speaker's side of a conversation, the codec switches to a much lower bit rate. In this low-rate mode, the encoder does not transmit full speech frames but instead sends Silence Descriptor (SID) frames at periodic intervals. These SID frames carry parameters that let the receiver's decoder generate comfort noise: artificial background noise that mimics the acoustic characteristics of the caller's environment and spares the listener an unnerving 'dead silence'.
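The receiver side of this idea can be illustrated with a minimal sketch. It is not the comfort noise procedure from the AMR specifications: the `SidParams` fields (`rms_level`, `tilt`), the one-pole shaping filter, and the function name `comfort_noise_frame` are hypothetical simplifications chosen for illustration. The only framing assumption is a 20 ms frame at 8 kHz sampling (160 samples).

```python
import math
import random
from dataclasses import dataclass

FRAME_SAMPLES = 160  # assumed framing: 20 ms at 8 kHz sampling


@dataclass
class SidParams:
    """Hypothetical, heavily simplified SID payload (not the real AMR format)."""
    rms_level: float  # target RMS amplitude of the comfort noise
    tilt: float       # one-pole low-pass coefficient in [0, 1); 0 gives white noise


def comfort_noise_frame(sid, rng, state=0.0):
    """Synthesize one frame of comfort noise from SID parameters.

    Returns (samples, filter_state) so the filter can run across frames.
    """
    shaped = []
    for _ in range(FRAME_SAMPLES):
        white = rng.gauss(0.0, 1.0)
        # Crude spectral shaping: stands in for the real spectral envelope.
        state = sid.tilt * state + (1.0 - sid.tilt) * white
        shaped.append(state)
    # Scale the frame so its RMS matches the level signalled by the sender.
    rms = math.sqrt(sum(x * x for x in shaped) / FRAME_SAMPLES)
    gain = sid.rms_level / rms if rms > 0.0 else 0.0
    return [gain * x for x in shaped], state


# Usage: the decoder keeps reusing the last received SID parameters
# until a new SID frame or a speech frame arrives.
rng = random.Random(0)
last_sid = SidParams(rms_level=100.0, tilt=0.5)
frame, filt_state = comfort_noise_frame(last_sid, rng)
```

The point of the sketch is that only a handful of parameters needs to cross the radio interface during silence; the noise samples themselves are generated locally at the receiver.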
Architecturally, SCR involves close coordination between the voice activity detector (VAD) and the speech codec within the mobile terminal or network transcoder. The VAD analyzes the input audio signal to classify each frame as containing active speech or not. This classification is passed to the codec, which then selects the appropriate encoding algorithm and frame type. The system also includes a comfort noise generator (CNG) at the receiving end, which uses the parameters in the SID frames to synthesize noise. The operation is managed by in-band signaling within the speech frame structure itself, indicating whether a frame is a speech frame, a SID frame, or a marker for the end of a silence period.
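The transmit-side coordination between the VAD and the codec can be sketched as a small state machine. This is a simplified illustration, not the procedure from the specifications: the frame type names are loosely modelled on those used in AMR DTX handling, while `schedule_frames`, the hangover length, and the SID update interval are assumptions picked for the example. In particular, the sketch spaces every SID update evenly after the first SID frame, which a real encoder need not do.

```python
from enum import Enum


class TxType(Enum):
    SPEECH = "SPEECH"          # full-rate speech frame
    SID_FIRST = "SID_FIRST"    # marks the start of a silence period
    SID_UPDATE = "SID_UPDATE"  # refreshes the comfort noise parameters
    NO_DATA = "NO_DATA"        # nothing is transmitted for this frame


def schedule_frames(vad_flags, hangover=7, sid_interval=8):
    """Map per-frame VAD decisions (True = speech) to TX frame types.

    hangover and sid_interval are illustrative defaults, not normative values.
    """
    out = []
    silent_run = 0  # consecutive frames classified as non-speech
    since_sid = 0   # frames elapsed since the last SID frame
    for active in vad_flags:
        if active:
            silent_run = 0
            out.append(TxType.SPEECH)
            continue
        silent_run += 1
        if silent_run <= hangover:
            # Hangover: keep sending speech frames for a short while so
            # trailing low-energy speech is not clipped.
            out.append(TxType.SPEECH)
        elif silent_run == hangover + 1:
            out.append(TxType.SID_FIRST)
            since_sid = 0
        else:
            since_sid += 1
            if since_sid % sid_interval == 0:
                out.append(TxType.SID_UPDATE)
            else:
                out.append(TxType.NO_DATA)
    return out


# Usage: two speech frames followed by a long pause.
types = schedule_frames([True] * 2 + [False] * 20, hangover=2, sid_interval=4)
print([t.value for t in types])
```

Most frames in a long pause end up as `NO_DATA`, which is where the radio and battery savings come from.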
SCR's role in the network is primarily one of efficiency. By drastically reducing the average bit rate of a voice call, it decreases the load on the radio interface. This translates directly into increased system capacity, as more users can be supported on the same radio resources. For the user, it provides the significant benefit of extended battery life in the mobile terminal, as the transmitter can be powered down or operate at a lower power level during silence periods. It is a core component of the traffic adaptation and power saving mechanisms defined in 3GPP specifications, making voice services more spectrally efficient and user-friendly.
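The size of the saving can be estimated with a back-of-the-envelope calculation. The sketch below rests on stated assumptions rather than figures from this text: 20 ms frames, a 12.2 kbit/s speech mode, a SID frame of 39 bits sent once every 8 frames, and a voice activity factor of 0.5. It also ignores hangover frames and the first SID frame of each pause, so it slightly overstates the gain. The function name `average_bit_rate` is hypothetical.

```python
def average_bit_rate(speech_rate_bps, activity,
                     sid_bits=39, sid_interval_frames=8, frame_ms=20):
    """Estimate the average source bit rate with SCR enabled.

    activity is the fraction of frames coded as speech (0.0 to 1.0).
    All default values are illustrative assumptions.
    """
    frames_per_second = 1000 / frame_ms
    # During silence only one SID frame is sent per sid_interval_frames.
    sid_rate_bps = sid_bits * frames_per_second / sid_interval_frames
    return activity * speech_rate_bps + (1.0 - activity) * sid_rate_bps


# Usage: 12.2 kbit/s speech mode with speech present half of the time.
full_rate = 12200
avg = average_bit_rate(full_rate, activity=0.5)
saving = 1.0 - avg / full_rate
print(f"average: {avg:.1f} bit/s, saving: {saving:.1%}")
```

Under these assumptions the silence-period rate is a few hundred bit/s, so the average rate is dominated almost entirely by the voice activity factor.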
Purpose & Motivation
SCR was developed to address the inherent inefficiency of transmitting constant bit rate audio during a voice conversation, where significant portions consist of silence or low-activity noise. Early digital voice systems transmitted a continuous stream of bits regardless of speech activity, wasting precious radio spectrum and draining mobile phone batteries unnecessarily. The motivation for SCR was twofold: to increase the capacity of cellular networks—a critical commercial driver—and to improve the talk time of handsets, a key user experience factor.
The technology solves the problem of resource waste during speech pauses. By introducing a discontinuous transmission (DTX) mechanism controlled by the source signal itself, it reduces the transmissions on the radio interface during pauses and, where the access technology permits, lets the network re-allocate the freed-up radio resources (timeslots, codes, or bandwidth) to other users. This was particularly important for GSM and its evolution, where spectrum was a limited and expensive commodity. SCR, combined with the AMR codec's adaptability, became a cornerstone for providing high-quality voice service while making efficient use of the system as a whole. It represented a shift from treating voice as a constant bit-rate stream to treating it as a variable-rate service that adapts to the actual information content.
Key Features
- Dynamically switches codec rate based on voice activity detection (VAD)
- Transmits low-bit-rate Silence Descriptor (SID) frames during silence
- Enables generation of comfort noise at the receiver to avoid dead silence
- Significantly reduces average bit rate and radio resource consumption
- Extends mobile terminal battery life by allowing transmitter power-down
- Integral part of AMR and AMR-WB speech codec specifications
Evolution Across Releases
Source Controlled Rate operation was standardized as a core feature of the Adaptive Multi-Rate (AMR) speech codec. The initial architecture defined the Voice Activity Detector (VAD), algorithms for generating Silence Descriptor (SID) frames, and the comfort noise generation procedures, establishing the framework for efficient discontinuous transmission in GSM and UMTS voice services.
Defining Specifications
| Specification | Title |
|---|---|
| TS 26.071 | 3GPP TS 26.071 |
| TS 26.091 | 3GPP TS 26.091 |
| TS 26.092 | 3GPP TS 26.092 |
| TS 26.093 | 3GPP TS 26.093 |
| TS 26.101 | 3GPP TS 26.101 |
| TS 26.102 | 3GPP TS 26.102 |
| TS 26.103 | 3GPP TS 26.103 |
| TS 26.171 | 3GPP TS 26.171 |
| TS 26.191 | 3GPP TS 26.191 |
| TS 26.192 | 3GPP TS 26.192 |
| TS 26.193 | 3GPP TS 26.193 |
| TS 26.201 | 3GPP TS 26.201 |
| TS 26.202 | 3GPP TS 26.202 |
| TS 26.454 | 3GPP TS 26.454 |
| TS 28.062 | 3GPP TS 28.062 |