What is LP? Linear Prediction

Description

Linear Prediction (LP) is a mathematical operation in which the next value of a discrete-time signal is estimated as a weighted sum (linear combination) of its past values. In speech coding, the speech signal s(n) is modeled as the output of an all-pole filter (the synthesis filter) driven by an excitation signal e(n), which in the simplest model is either a periodic pulse train (for voiced sounds) or white noise (for unvoiced sounds). The relationship is s(n) = Σ_{k=1..p} a_k · s(n-k) + G · e(n), where the a_k are the Linear Prediction Coefficients (LPCs), p is the prediction order, and G is a gain factor. The coefficients a_k define the spectral envelope of the speech, whose peaks correspond to the formants.
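
As a rough illustration of the model above, the following Python sketch runs the difference equation directly. The function names (lp_synthesize, pulse_train), the zero initial filter state, and the second-order coefficient values are assumptions chosen for readability; they are not taken from any 3GPP codec.

```python
def lp_synthesize(a, excitation, gain=1.0):
    """Compute s(n) = sum_{k=1..p} a[k-1] * s(n-k) + gain * e(n), zero initial state."""
    p = len(a)
    s = []
    for n, e_n in enumerate(excitation):
        # Weighted sum of up to p previous output samples (missing history counts as zero).
        prediction = sum(a[k - 1] * s[n - k] for k in range(1, p + 1) if n - k >= 0)
        s.append(prediction + gain * e_n)
    return s


def pulse_train(length, period):
    """Unit pulses every `period` samples, a crude stand-in for voiced excitation."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


# Illustrative 2nd-order coefficients (a stable resonator), not from any real codec.
a = [1.2, -0.8]
voiced = lp_synthesize(a, pulse_train(16, 8), gain=0.5)
```

Replacing the pulse train with a list of random samples would give the unvoiced case of the same model; only the excitation changes, while the filter stays the same.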

The encoding process involves analyzing a short frame of speech (e.g., 20 ms) to compute the set of LPCs that best predict the signal within that frame. This is typically done by computing the frame's autocorrelation and solving the resulting Yule-Walker equations with the Levinson-Durbin recursion. The difference between the original speech signal and the predicted signal is the LP residual, which represents the excitation. How the residual is encoded depends on the codec. Classic very-low-bitrate LPC vocoders model it parametrically, transmitting only the type of excitation (voiced/unvoiced), the pitch period, and a gain. Code-excited codecs such as Algebraic Code-Excited Linear Prediction (ACELP) instead use an analysis-by-synthesis search and transmit an adaptive-codebook (pitch) lag, a fixed algebraic codebook index, and the associated gains.
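
The analysis step can be sketched in a few lines of Python. This is a minimal illustration of the autocorrelation method with the Levinson-Durbin recursion, assuming a frame that is already windowed; real codecs add steps omitted here (windowing, lag windowing, bandwidth expansion). The function names and the toy second-order test signal are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """r[k] = sum_n frame[n] * frame[n-k] for k = 0..max_lag."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for a_1..a_p from autocorrelation r[0..p].

    Returns (a, err): a[k-1] is a_k in s(n) ~ sum_k a_k * s(n-k), and err is
    the remaining prediction-error energy.
    """
    if r[0] <= 0.0:
        return [0.0] * order, 0.0  # Silent frame: nothing to predict.
    a, err = [], r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        k = (r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))) / err
        # Update the lower-order coefficients, then append the new one.
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


def lp_residual(frame, a):
    """e(n) = s(n) - sum_k a_k * s(n-k); samples before the frame count as zero."""
    return [frame[n] - sum(a[k - 1] * frame[n - k]
                           for k in range(1, len(a) + 1) if n - k >= 0)
            for n in range(len(frame))]


# Toy frame: impulse response of a known 2nd-order all-pole filter (illustrative values).
true_a = [1.2, -0.8]
frame = []
for n in range(200):
    past = sum(true_a[k] * frame[n - 1 - k] for k in range(2) if n - 1 - k >= 0)
    frame.append(past + (1.0 if n == 0 else 0.0))

a_hat, err = levinson_durbin(autocorrelation(frame, 2), 2)
residual = lp_residual(frame, a_hat)
```

For this toy signal the recursion should recover coefficients very close to true_a, and the residual collapses to roughly a single pulse at n = 0, which is the excitation that generated the frame.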

At the decoder, the received LPCs are used to construct the synthesis filter. The received excitation parameters are used to generate the excitation signal e(n), which is then passed through this synthesis filter to reconstruct the speech waveform. Because quantizing the LPCs directly can make the synthesis filter unstable, they are converted for quantization and transmission to a more robust representation such as Line Spectral Pairs (LSPs) or Immittance Spectral Frequencies (ISFs), for which stability is easy to check and preserve. LP analysis adds computational cost but provides very high compression efficiency, making it the backbone of the narrowband and wideband speech codecs standardized by 3GPP, such as the Adaptive Multi-Rate (AMR) and Enhanced Voice Services (EVS) codecs.
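
To see why direct-form LPCs are fragile under quantization, the sketch below checks filter stability before and after coarse rounding. It does not implement the LSP or ISF conversion. Instead it uses the step-down (backward Levinson) recursion to recover reflection coefficients, relying on the standard result that the all-pole filter is stable when every reflection coefficient has magnitude below 1. The function names, coefficient values, and the 0.5 quantization step are illustrative assumptions.

```python
def reflection_coefficients(a):
    """Step-down recursion from a_1..a_p (a[k-1] is a_k) to reflection coefficients.

    Stops early if a coefficient with magnitude >= 1 is found, so the result
    may then cover only the highest stages.
    """
    cur = list(a)
    ks = []
    for i in range(len(a), 0, -1):
        k = cur[i - 1]
        ks.append(k)
        if abs(k) >= 1.0:
            break  # Already unstable; the next step would divide by (about) zero.
        # Recover the order-(i-1) predictor from the order-i predictor.
        cur = [(cur[j] + k * cur[i - 2 - j]) / (1.0 - k * k) for j in range(i - 1)]
    return ks[::-1]


def is_stable(a):
    """True when every reflection coefficient has magnitude below 1."""
    return all(abs(k) < 1.0 for k in reflection_coefficients(a))


def quantize(a, step):
    """Uniform rounding of each coefficient to a multiple of `step`."""
    return [round(x / step) * step for x in a]


a = [1.9, -0.95]            # Stable, but with poles close to the unit circle.
a_q = quantize(a, 0.5)      # Coarse direct-form quantization gives [2.0, -1.0].
stable_before = is_stable(a)
stable_after = is_stable(a_q)
```

In this example the unquantized filter passes the check while the rounded one does not. Representations such as LSPs avoid the problem because their stability condition (properly ordered frequencies) is simple to enforce after quantization.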

Purpose & Motivation

Linear Prediction exists to solve the problem of efficiently digitizing and compressing speech signals for transmission over bandwidth-constrained wireless channels. Its primary purpose is to exploit the strong short-term correlation (redundancy) present in speech signals, where each sample is highly predictable from preceding samples due to the physical properties of the human vocal tract. By modeling this correlation, LP allows the codec to transmit only the model parameters (LPCs) and a simplified representation of the excitation, achieving high compression ratios (e.g., from 64 kbps PCM down to 5.9 kbps AMR) while maintaining intelligible speech quality.

Historically, LP-based coding replaced simpler waveform codecs (like PCM and ADPCM) for mobile voice because it offered far better compression, which was critical for early digital cellular systems (2G GSM) with limited spectral efficiency. The introduction of the Full-Rate speech codec in GSM, based on Regular Pulse Excitation-Long Term Prediction (RPE-LTP), marked the adoption of LP principles. Subsequent evolution through the AMR codec in 3G UMTS and the EVS codec in 4G/5G has continuously refined LP techniques, increasing prediction order, improving quantization of LPCs (using LSPs), and enhancing excitation modeling (e.g., with ACELP and TCX). This evolution has addressed limitations like poor music handling and robotic voice artifacts, extending LP's utility from narrowband telephony to high-quality fullband audio and voice-over-LTE (VoLTE) services.

Evolution Across Releases

Rel-8 Initial

Introduced with the standardization of the AMR Wideband (AMR-WB) codec for GSM and UMTS, extending Linear Prediction techniques to wideband speech (50-7000 Hz). The initial architecture for wideband used a higher LP order (typically 16 vs. 10 for narrowband) to model the broader spectrum and introduced more advanced quantization methods for the LP parameters, significantly improving naturalness and voice quality compared to narrowband telephony.

TS 26.090, TS 26.092, TS 26.190, TS 26.192, TS 26.226, TS 26.253, TS 26.267, TS 26.290, TS 26.818, TS 46.060, TS 46.062

Defining Specifications

3GPP specifications that define or reference LP, with the latest known release. Sourced from the 3GPP document catalog.

Specification	Title	Release
TS 26.090 vj00	AMR Speech Codec Detailed Mapping Specification	Rel-19
TS 26.092 vj00	AMR Comfort Noise for SCR Operation	Rel-19
TS 26.190 vj00	AMR-WB Speech Codec Detailed Mapping	Rel-19
TS 26.192 vj00	AMR-WB Comfort Noise Requirements	Rel-19
TS 26.226 vj00	Cellular Text Telephone Modem (CTM)	Rel-19
TS 26.253 vj00	IVAS Codec Algorithmic Description	Rel-19
TS 26.267 vj00	eCall In-band Modem Specification	Rel-19
TS 26.290 vj00	AMR-WB+ Audio Codec Specification	Rel-19
TS 26.818 vf00	Audio Media Profiles Test Results for VR Streaming	Rel-15
TS 46.060 vj00	GSM Enhanced Full Rate Speech Codec	Rel-19
TS 46.062 vj00	GSM EFR DTX Comfort Noise Specification	Rel-19