STFT

Short-Term Fourier Transform

Introduced in Rel-18

STFT is a signal processing technique that divides a non-stationary signal into short, overlapping segments and computes the Fourier transform of each segment. The result is a time-frequency representation used in audio codecs and radio analysis.

Category: Physical Layer
Introduced: Rel-18
Where: Services › Codecs
Specifications: 1 spec

Description

The Short-Term Fourier Transform (STFT), more commonly called the short-time Fourier transform, is a fundamental digital signal processing technique used within 3GPP for the analysis of time-varying signals. A single Discrete Fourier Transform (DFT) over a whole signal implicitly treats the frequency content as constant over the analysis interval. STFT is instead designed for non-stationary signals whose frequency content changes over time, such as speech, audio, or certain radio channel conditions. The core operation segments the input signal into shorter, usually overlapping, time windows (frames). A window function, such as a Hamming or Hann window, is applied to each segment to reduce spectral leakage. The DFT is then computed independently for each windowed segment. The result is a two-dimensional array of complex coefficients indexed by time frame and frequency bin; plotting its magnitude gives a spectrogram that shows how the frequency spectrum evolves over time.
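The steps above (segment, window, transform) can be sketched in a few lines of NumPy. This is a minimal illustration, not the procedure of any 3GPP specification: the function name `stft`, the frame length of 256 samples, the hop of 128 samples, and the two-tone test signal are all assumptions chosen for the example.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hypothetical helper: return complex STFT of shape (n_frames, frame_len // 2 + 1)."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)              # Hann window to reduce spectral leakage
    n_frames = 1 + (len(x) - frame_len) // hop  # only full frames, no padding
    frames = np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return np.fft.rfft(frames * window, axis=1)  # one DFT per windowed frame

# Illustrative non-stationary signal: 1 kHz tone, then 3 kHz tone, sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.where(t < 0.5, np.sin(2 * np.pi * 1000 * t), np.sin(2 * np.pi * 3000 * t))

spec = np.abs(stft(x))                       # magnitude per (frame, frequency bin)
freqs = np.fft.rfftfreq(256, d=1 / fs)       # centre frequency of each bin in Hz
peak_first = freqs[np.argmax(spec[0])]       # dominant frequency in the first frame
peak_last = freqs[np.argmax(spec[-1])]       # dominant frequency in the last frame
print(spec.shape, peak_first, peak_last)
```

The first frame peaks near 1 kHz and the last near 3 kHz, which is the time localization that a single DFT magnitude spectrum of the whole signal would not show.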

In 3GPP audio and speech codec specifications such as TS 26.253 (the IVAS codec algorithmic description), STFT provides time-frequency analysis for transform-domain processing. A codec of this kind converts time-domain audio samples into the time-frequency domain, where psychoacoustic models can identify perceptually irrelevant components so that quantization and compression spend fewer bits on them. The parameters (window size, overlap, and transform length) are chosen according to the signal characteristics and the desired trade-off between time resolution and frequency resolution: a longer window gives finer frequency bins but blurs events in time, and a shorter window does the opposite.
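The time/frequency trade-off can be made concrete with simple arithmetic. The sketch below assumes a hypothetical helper `stft_resolution` and illustrative values (48 kHz sampling, 50% overlap); none of these numbers are taken from TS 26.253.

```python
def stft_resolution(fs_hz, frame_len, hop):
    """Hypothetical helper: basic time and frequency figures for an STFT setup."""
    return {
        "frame_ms": 1000.0 * frame_len / fs_hz,  # duration covered by one frame
        "hop_ms": 1000.0 * hop / fs_hz,          # time step between frames
        "bin_hz": fs_hz / frame_len,             # DFT bin spacing (no zero-padding)
        "overlap": 1.0 - hop / frame_len,        # fraction shared by adjacent frames
    }

# Doubling the frame length halves the bin spacing but doubles the frame duration.
for frame_len in (240, 480, 960, 1920):
    r = stft_resolution(48000, frame_len, frame_len // 2)
    print(frame_len, r["frame_ms"], r["bin_hz"], r["overlap"])
```

For instance, 960 samples at 48 kHz span 20 ms and give 50 Hz bin spacing, so a transient shorter than a few milliseconds is smeared across the frame while closely spaced tones are resolved well.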

For radio access network (RAN) applications, STFT can be used in channel sounding, interference analysis, and spectrum sensing. By applying STFT to received baseband signals, engineers can observe how the channel response or interference patterns vary over short time scales, which informs adaptive modulation and coding, beamforming, and dynamic spectrum sharing. Implementations in network equipment typically compute each frame's DFT with the Fast Fourier Transform (FFT) to meet real-time processing constraints. In this way STFT supports both efficient multimedia services and radio resource management in 5G-Advanced and beyond.
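As a rough sketch of the spectrum-sensing use described above, the following applies a two-sided STFT to synthetic complex baseband samples and flags the frames in which a narrowband interferer is present. Everything here is an assumption for illustration: the helper name `complex_stft_power`, the synthetic noise and tone, and the threshold of 100 times the median bin power. It is not a method taken from any 3GPP specification.

```python
import numpy as np

def complex_stft_power(iq, frame_len=128, hop=64):
    """Hypothetical helper: per-frame power spectrum of complex baseband samples."""
    iq = np.asarray(iq, dtype=complex)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(iq) - frame_len) // hop
    frames = np.stack([iq[i * hop : i * hop + frame_len] for i in range(n_frames)])
    # Full (two-sided) FFT, shifted so that frequencies run from negative to positive.
    spectrum = np.fft.fftshift(np.fft.fft(frames * window, axis=1), axes=1)
    return np.abs(spectrum) ** 2

# Synthetic input: weak complex noise plus a tone burst at +0.25 cycles/sample
# that is present only for samples 1024..2047.
rng = np.random.default_rng(0)
n = 4096
k = np.arange(n)
noise = 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
burst = np.where((k >= 1024) & (k < 2048), np.exp(2j * np.pi * 0.25 * k), 0)

power = complex_stft_power(noise + burst)
freqs = np.fft.fftshift(np.fft.fftfreq(128))      # normalized bin frequencies
bin_idx = np.argmin(np.abs(freqs - 0.25))         # bin containing the interferer
# Arbitrary illustrative detector: compare against the median power in that bin.
active = power[:, bin_idx] > 100 * np.median(power[:, bin_idx])
print(np.flatnonzero(active))
```

The flagged frame indices cluster around the interval where the burst was inserted, showing when the interference occurred as well as at which frequency.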

Purpose & Motivation

STFT was introduced into 3GPP standards to address a basic limitation of traditional Fourier analysis when applied to real-world communication signals. Signals like human speech, music, and time-varying radio channels are non-stationary; their statistical properties change over time. A DFT over the whole signal yields a single spectrum whose magnitude shows which frequency components are present but not when they occur. This is inadequate for tasks like perceptual audio coding, where distinguishing transient events (e.g., a drum hit) from sustained tones is critical for compression efficiency and quality.

Prior to its formal inclusion, codec designs might have used proprietary or less suitable time-frequency transformations. Standardizing the use of STFT from Release 18 onwards provides a common, efficient mathematical framework for next-generation audio and speech codecs. It supports features such as bandwidth extension, noise suppression, and immersive audio object coding by offering a time-frequency grid that can be analyzed and modified directly. For radio systems, it provides a tool to move beyond static channel models, allowing the network to adapt to rapid fading and interference changes, which is essential for ultra-reliable low-latency communication (URLLC) and high-frequency bands with pronounced Doppler effects.

Evolution Across Releases

Rel-18 Initial

STFT was first introduced in 3GPP specification TS 26.253, which defined its standardized application to advanced audio coding and established parameters such as window shapes, sizes, and overlap factors for new codec profiles. This provided a unified signal processing foundation for immersive and enhanced media services.

Defining Specifications

3GPP specifications that define or reference STFT, with the latest known release. Sourced from the 3GPP document catalog.

Specification | Title | Release
TS 26.253 vj00 | IVAS Codec Algorithmic Description | Rel-19