LP

Linear Prediction

Introduced in Rel-8 Also in: User Equipment

LP is a fundamental digital signal processing technique used in speech codecs to model a signal's spectral envelope by predicting a sample's value as a linear combination of past samples for efficient compression.

Category: Services
Introduced: Rel-8
Where: Services › Codecs
Also touches: 1 segment
Specifications: 11 specs

Description

Linear Prediction (LP) is a mathematical operation in which the next value of a discrete-time signal is estimated as a weighted sum (linear combination) of its past values. In the context of speech coding, the speech signal s(n) is modeled as the output of an all-pole filter (the synthesis filter) excited by an input signal e(n), which in the classical model is either a periodic pulse train (for voiced sounds) or white noise (for unvoiced sounds). The relationship is expressed as s(n) = Σ (a_k * s(n-k)) + G * e(n), for k=1 to p, where 'a_k' are the Linear Prediction Coefficients (LPCs), 'p' is the prediction order, and G is a gain factor. The coefficients 'a_k' define the spectral envelope of the speech, whose peaks correspond to the formants.
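As a compact restatement of the model above, the same relations can be written in predictor, residual, and filter form. This is a sketch that keeps the sign convention of the equation above. The symbols \hat{s}(n) (predicted sample), r(n) (prediction residual), and A(z) (analysis or inverse filter) are notation introduced here for illustration, and some codec specifications define A(z) with the opposite sign on the coefficients.

```latex
\begin{aligned}
\hat{s}(n) &= \sum_{k=1}^{p} a_k\, s(n-k) \\
r(n) &= s(n) - \hat{s}(n) = G\, e(n) \\
A(z) &= 1 - \sum_{k=1}^{p} a_k z^{-k}, \qquad
H(z) = \frac{S(z)}{E(z)} = \frac{G}{A(z)}
\end{aligned}
```

For instance, with p = 1 and a_1 = 0.9, a previous sample s(n-1) = 10 gives the prediction 9. If the actual sample is 9.5, the residual is 0.5, which takes far fewer bits to represent than the sample itself.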

The encoding process involves analyzing a short frame of speech (e.g., 20 ms) to compute the set of LPCs that best predict the signal within that frame, in the sense of minimizing the energy of the prediction error. This is typically done by solving the Yule-Walker equations with the Levinson-Durbin recursion. The difference between the original speech signal and the predicted signal is the LP residual, which represents the excitation. Classic very-low-bitrate LPC vocoders model the residual parametrically and transmit only a voiced/unvoiced decision, the pitch period, and a gain. Higher-quality codecs encode the residual more precisely: in Algebraic Code-Excited Linear Prediction (ACELP), an analysis-by-synthesis search selects an adaptive-codebook (pitch) lag, a fixed algebraic codebook index, and the corresponding gains, and these parameters are transmitted.
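The analysis steps described above (autocorrelation, Levinson-Durbin, residual) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular 3GPP codec: the function names, the synthetic test frame, and the order of 2 are assumptions made for the example, and real encoders add steps omitted here, such as analysis windowing and bandwidth expansion.

```python
import math


def autocorrelation(frame, max_lag):
    """Return R(0..max_lag) for one frame (no analysis window applied)."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i - lag] for i in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for a_1..a_p.

    Returns (coefficients, final prediction-error energy), using the
    convention s_hat(n) = sum_k a_k * s(n - k).
    """
    a = [0.0] * (order + 1)  # a[0] is unused padding so indices match k
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        # Reflection coefficient for stage i.
        k_i = (r[i] - sum(a[k] * r[i - k] for k in range(1, i))) / err
        updated = a[:]
        updated[i] = k_i
        for k in range(1, i):
            updated[k] = a[k] - k_i * a[i - k]
        a = updated
        err *= 1.0 - k_i * k_i
    return a[1:], err


def lp_residual(frame, a):
    """residual(n) = s(n) - sum_k a_k * s(n - k); samples before the frame are 0."""
    residual = []
    for n, sample in enumerate(frame):
        predicted = sum(
            a[k - 1] * frame[n - k] for k in range(1, len(a) + 1) if n - k >= 0
        )
        residual.append(sample - predicted)
    return residual


if __name__ == "__main__":
    # Hypothetical 20 ms frame at 8 kHz (160 samples): a decaying resonance.
    frame = [math.exp(-0.01 * n) * math.sin(0.3 * n) for n in range(160)]
    coeffs, err = levinson_durbin(autocorrelation(frame, 2), 2)
    residual = lp_residual(frame, coeffs)
    print("LPCs:", coeffs)
    print("frame energy:", sum(x * x for x in frame))
    print("residual energy:", sum(x * x for x in residual))
```

Running the script prints the estimated coefficients and the frame and residual energies, which can be compared to see how much of the signal the predictor removes.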

At the decoder, the received LPCs are used to construct the synthesis filter. The received excitation parameters are used to generate the excitation signal e(n), which is then passed through this synthesis filter to reconstruct the speech waveform. The LPCs are not quantized directly: they are converted to a more robust representation such as Line Spectral Pairs (LSPs) or Immittance Spectral Frequencies (ISFs) for quantization and transmission, because the ordering property of these parameters makes it easy to keep the quantized synthesis filter stable. LP analysis adds computational cost but yields high compression efficiency, making it the backbone of the narrowband and wideband speech codecs standardized by 3GPP, such as the Adaptive Multi-Rate (AMR) and Enhanced Voice Services (EVS) codecs.
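The decoder side can be sketched in the same spirit: an all-pole synthesis filter driven by an excitation signal, plus a stability check on the coefficients. This is an illustrative sketch under stated assumptions. The function names and the example coefficients are hypothetical, and the stability test uses the step-down (reflection coefficient) recursion as a simple stand-in rather than the LSP or ISF conversion that the codecs themselves use.

```python
def synthesize(excitation, a, gain=1.0):
    """All-pole synthesis: s(n) = sum_k a_k * s(n - k) + G * e(n)."""
    speech = []
    for n, e in enumerate(excitation):
        feedback = sum(
            a[k - 1] * speech[n - k] for k in range(1, len(a) + 1) if n - k >= 0
        )
        speech.append(feedback + gain * e)
    return speech


def is_stable(a):
    """Check stability of the synthesis filter via the step-down recursion.

    The filter is stable exactly when every reflection coefficient
    recovered from a_1..a_p has magnitude below 1.
    """
    coeffs = list(a)
    while coeffs:
        k_i = coeffs[-1]  # reflection coefficient of the current order
        if abs(k_i) >= 1.0:
            return False
        m = len(coeffs)
        # Step down to the predictor of order m - 1.
        coeffs = [
            (coeffs[j] + k_i * coeffs[m - 2 - j]) / (1.0 - k_i * k_i)
            for j in range(m - 1)
        ]
    return True


if __name__ == "__main__":
    lpcs = [1.2, -0.5]               # hypothetical decoded coefficients
    excitation = [1.0] + [0.0] * 9   # a single pulse as excitation
    if is_stable(lpcs):
        print(synthesize(excitation, lpcs))
```

With a single pulse as excitation, the output is the impulse response of the synthesis filter, which decays for a stable coefficient set and grows without bound for an unstable one.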

Purpose & Motivation

Linear Prediction exists to solve the problem of efficiently compressing digitized speech for transmission over bandwidth-constrained wireless channels. Its primary purpose is to exploit the strong short-term correlation (redundancy) in speech, where each sample is highly predictable from the preceding samples because the vocal tract acts as a slowly varying resonant filter. By modeling this correlation, LP allows the codec to transmit only the model parameters (LPCs) and a compact representation of the excitation, achieving high compression ratios (e.g., from 64 kbit/s PCM down to the 5.9 kbit/s AMR mode) while maintaining intelligible speech quality.

Historically, LP-based coding replaced simpler waveform codecs (like PCM and ADPCM) for mobile voice because it offered far better compression, which was critical for early digital cellular systems (2G GSM) with limited spectral efficiency. The introduction of the Full-Rate speech codec in GSM, based on Regular Pulse Excitation-Long Term Prediction (RPE-LTP), marked the adoption of LP principles. Subsequent evolution through the AMR codec in 3G UMTS and the EVS codec in 4G/5G has continuously refined LP techniques, increasing prediction order, improving quantization of LPCs (using LSPs), and enhancing excitation modeling (e.g., with ACELP and TCX). This evolution has addressed limitations like poor music handling and robotic voice artifacts, extending LP's utility from narrowband telephony to high-quality fullband audio and voice-over-LTE (VoLTE) services.

Evolution Across Releases

Rel-8 Initial

Catalogued here from Rel-8 in connection with the AMR Wideband (AMR-WB) codec for GSM and UMTS, which extends Linear Prediction techniques to wideband speech (50-7000 Hz). The AMR-WB codec itself was first specified in Rel-5. Its wideband architecture uses a higher LP order (16, versus 10 in narrowband AMR) to model the broader spectrum, together with more advanced quantization of the LP parameters, which significantly improves naturalness and voice quality compared with narrowband telephony.

Defining Specifications

3GPP specifications that define or reference LP, with the latest known release. Sourced from the 3GPP document catalog.

Specification   Title                                               Release
TS 26.090 vj00  AMR Speech Codec Detailed Mapping Specification     Rel-19
TS 26.092 vj00  AMR Comfort Noise for SCR Operation                 Rel-19
TS 26.190 vj00  AMR-WB Speech Codec Detailed Mapping                Rel-19
TS 26.192 vj00  AMR-WB Comfort Noise Requirements                   Rel-19
TS 26.226 vj00  Cellular Text Telephone Modem (CTM)                 Rel-19
TS 26.253 vj00  IVAS Codec Algorithmic Description                  Rel-19
TS 26.267 vj00  eCall In-band Modem Specification                   Rel-19
TS 26.290 vj00  AMR-WB+ Audio Codec Specification                   Rel-19
TS 26.818 vf00  Audio Media Profiles Test Results for VR Streaming  Rel-15
TS 46.060 vj00  GSM Enhanced Full Rate Speech Codec                 Rel-19
TS 46.062 vj00  GSM EFR DTX Comfort Noise Specification             Rel-19