Description
Multipulse Maximum Likelihood Quantization (MP-MLQ) is a speech coding technique from the linear-prediction analysis-by-synthesis family, the same family as Code-Excited Linear Prediction (CELP). It is the excitation coding method of the 6.3 kbit/s mode of the ITU-T G.723.1 dual-rate speech codec; the 5.3 kbit/s mode of that codec uses ACELP instead. In 3GPP, the term appears in TS 26.110, which describes the codec set for circuit-switched multimedia telephony (3G-324M), where G.723.1 is an optional speech codec alongside AMR. The core principle of this codec family is to model the vocal tract with a linear predictive (LP) filter and to represent the residual excitation signal compactly. MP-MLQ refines the excitation modeling: the excitation is represented by a small number of pulses (multipulse) with selectable positions and amplitudes. The 'Maximum Likelihood Quantization' part refers to the method used to select the combination of pulse positions and gains that, when passed through the LP synthesis filter, produces synthesized speech closest, in a perceptually weighted sense, to the original input speech.
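The excitation model described above can be illustrated with a minimal sketch: a handful of pulses is placed on an otherwise silent excitation vector, which is then passed through an all-pole LP synthesis filter. The function names, the first-order predictor, and the pulse values are assumptions chosen for illustration; they are not taken from any standardized codec.

```python
def build_excitation(length, pulses):
    """Place (position, amplitude) pulses on an otherwise zero excitation vector."""
    excitation = [0.0] * length
    for position, amplitude in pulses:
        excitation[position] += amplitude
    return excitation


def lp_synthesis(excitation, lp_coeffs):
    """All-pole synthesis filter 1/A(z), with A(z) = 1 + sum_k a_k z^-k.

    Computes s[n] = e[n] - sum_k a_k * s[n - k], starting from zero state.
    """
    output = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc -= a * output[n - k]
        output.append(acc)
    return output


# Hypothetical first-order predictor and two pulses, for illustration only.
excitation = build_excitation(8, [(0, 1.0), (4, -0.5)])
speech = lp_synthesis(excitation, [-0.5])
```

Each pulse excites a decaying copy of the filter's impulse response, so a few well-placed pulses can approximate a much denser waveform.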
The encoding process involves several stages. First, linear prediction analysis is performed on a frame of speech to extract LP filter coefficients, which are quantized and transmitted. A perceptual weighting filter derived from the LP coefficients then shapes the error criterion, so that coding noise is tolerated where the speech spectrum masks it and penalized where it would be audible. In an analysis-by-synthesis loop, the encoder searches the structured set of candidate multipulse excitations for the sequence that minimizes this perceptually weighted error between the original and synthesized speech. This search is computationally intensive but yields high-quality synthesis. The parameters describing the selected pulse positions and amplitudes are quantized and sent to the decoder along with the LP parameters. The decoder uses the received parameters to reconstruct the excitation signal and passes it through the LP synthesis filter to reproduce the speech signal.
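The search stage can be sketched as a generic greedy multipulse search. This is a simplified stand-in, not the normative MP-MLQ procedure: it picks one pulse at a time, and for each candidate position computes the gain that best matches the remaining target. The sketch assumes that `h` is the impulse response of the (already perceptually weighted) synthesis filter and that `target` is the weighted target signal; both names and the example values are hypothetical.

```python
def impulse_response(lp_coeffs, length):
    """Impulse response of the all-pole filter 1/A(z), A(z) = 1 + sum a_k z^-k."""
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lp_coeffs, start=1):
            if n - k >= 0:
                acc -= a * h[n - k]
        h.append(acc)
    return h


def greedy_multipulse_search(target, h, num_pulses):
    """Pick pulses one at a time, each minimising the remaining squared error."""
    length = len(target)
    residual = list(target)
    pulses = []
    for _ in range(num_pulses):
        best = None  # (error_reduction, position, gain)
        for pos in range(length):
            span = length - pos
            corr = sum(residual[pos + i] * h[i] for i in range(span))
            energy = sum(h[i] * h[i] for i in range(span))  # >= 1 since h[0] == 1
            reduction = corr * corr / energy
            if best is None or reduction > best[0]:
                best = (reduction, pos, corr / energy)
        _, pos, gain = best
        pulses.append((pos, gain))
        # Remove the chosen pulse's filtered contribution from the target.
        for i in range(length - pos):
            residual[pos + i] -= gain * h[i]
    return pulses, residual


# Hypothetical target built from two filtered pulses, for illustration only.
h = impulse_response([-0.5], 8)
target = [0.0, 0.0, 2.0, 1.0, 0.5, -0.75, -0.375, -0.1875]
pulses, residual = greedy_multipulse_search(target, h, 2)
```

A greedy search is cheaper than a joint search over all pulse combinations but is not guaranteed to find the jointly optimal set, which is one reason standardized coders use more elaborate procedures.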
Within G.723.1, MP-MLQ is the excitation method of the higher-rate 6.3 kbit/s mode, while the 5.3 kbit/s mode uses ACELP; the encoder can switch between the two rates at frame boundaries. The codec operates on 30 ms frames of narrowband speech sampled at 8 kHz, and each frame is divided into four 60-sample subframes in which the excitation is searched. The 6.3 kbit/s MP-MLQ mode gives the better quality of the two rates, which made G.723.1 a common choice for low-bit-rate multimedia telephony (H.324 and H.323 terminals) and, through 3G-324M, an option for circuit-switched video calls in 3G networks. The MP-MLQ search requires more processing power than the algebraic codebook search (ACELP) of the lower-rate mode, a trade-off that is accepted in exchange for the higher quality.
Purpose & Motivation
MP-MLQ was developed to deliver good speech quality at very low bit rates. G.723.1 was designed as the speech codec for low-bit-rate multimedia terminals, where speech shares a narrow channel with video and data, so every bit spent on speech has to be used efficiently. The codec works on narrowband telephone speech (roughly 300-3400 Hz) and offers two rates so that the terminal can trade speech quality against the capacity left for other media.
MP-MLQ addresses the problem of representing the excitation signal of an LP-based coder accurately with few bits. Within G.723.1, ACELP at 5.3 kbit/s is the more economical option, while MP-MLQ at 6.3 kbit/s gives a more accurate representation of the excitation, leading to cleaner synthesized speech with less noise and distortion. Its relevance to 3GPP comes from 3G-324M, the circuit-switched multimedia telephony service described in TS 26.110, where G.723.1 may be supported in addition to AMR, for example to interoperate with other H.324-based terminals.
Key Features
- Multipulse representation of the excitation signal in an LP analysis-by-synthesis codec
- Uses Maximum Likelihood Quantization for optimal pulse parameter selection
- Operates on narrowband speech sampled at 8 kHz, in 30 ms frames
- Used in the 6.3 kbit/s mode of the ITU-T G.723.1 codec (the 5.3 kbit/s mode uses ACELP)
- Perceptually weighted error minimization during encoding
- Higher computational complexity compared to algebraic codebook methods
Evolution Across Releases
MP-MLQ entered 3GPP specifications through TS 26.110, which describes the codec for the circuit-switched multimedia telephony service (3G-324M) and dates back to Release 99. In that context MP-MLQ is not a 3GPP-defined algorithm but part of ITU-T G.723.1, which 3G-324M terminals may support as a speech codec alongside AMR. The specification has been carried forward in later releases.
Defining Specifications
| Specification | Title |
|---|---|
| TS 26.110 | Codec for circuit switched multimedia telephony service; General description |