AAC-ELD

Advanced Audio Coding – Enhanced Low Delay

Services
Introduced in Rel-13
AAC-ELD is a low-latency audio codec from the MPEG-4 AAC family that 3GPP specifies for real-time communication services. It combines the compression efficiency of AAC-LD with spectral band replication (SBR) to deliver high-quality, wide-bandwidth audio at low delay, making it well suited to voice and video calls, gaming, and live streaming over mobile networks.

Description

Advanced Audio Coding – Enhanced Low Delay (AAC-ELD) is an audio coding technology standardized in 3GPP TS 26.923 for high-quality, low-latency audio transmission in mobile communication systems. The codec takes a hybrid approach that combines the core AAC-LD (Low Delay) transform coding with spectral band replication (SBR) and parametric stereo (PS) techniques. This architecture compresses efficiently while preserving audio fidelity and keeping algorithmic delay to approximately 15-32 milliseconds. That matters for real-time bidirectional communication, where end-to-end delay must remain below 150 milliseconds to avoid disrupting the conversation.
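The delay figures above can be put side by side with a little arithmetic. The sketch below uses only the two numbers quoted in the text (a 150 ms end-to-end target and a 15-32 ms algorithmic delay); the function name and the idea of treating the remainder as a single budget for network transport, jitter buffering, and device processing are illustrative assumptions, not part of any specification.

```python
# Figures quoted in the text: 150 ms end-to-end target, 15-32 ms codec delay.
END_TO_END_BUDGET_MS = 150.0


def remaining_budget_ms(codec_delay_ms, budget_ms=END_TO_END_BUDGET_MS):
    """Return the delay left for everything except the codec's algorithmic delay.

    Hypothetical helper: the remainder has to cover network transport,
    jitter buffering, and device processing combined.
    """
    if not 0.0 <= codec_delay_ms <= budget_ms:
        raise ValueError("codec delay must lie between 0 and the budget")
    return budget_ms - codec_delay_ms


if __name__ == "__main__":
    for codec_delay in (15.0, 32.0):
        left = remaining_budget_ms(codec_delay)
        print(f"codec {codec_delay:.0f} ms -> {left:.0f} ms left for the rest of the path")
```

Even at the upper end of the quoted range, the codec consumes roughly a fifth of the budget, which is why the remaining stages of the path still have room to operate.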

At its core, AAC-ELD uses a modified discrete cosine transform (MDCT) with a frame length optimized for low latency, typically 512 or 480 samples at a 48 kHz sampling rate. A psychoacoustic model analyzes the audio signal to determine masking thresholds, so the encoder can allocate bits efficiently and discard perceptually irrelevant information. The SBR component extends the high-frequency bandwidth: the encoder transmits only a small amount of side information, and the decoder uses it to reconstruct high-frequency content from the lower frequency bands. This significantly improves compression efficiency while adding little delay.
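To make the transform step concrete, here is a minimal sketch of a textbook MDCT and its inverse with 50% overlap-add. It assumes a symmetric sine window and a direct O(N^2) evaluation in pure Python; it is not the filterbank or window shape that AAC-ELD itself specifies, and all function names are illustrative. The point is only to show how a frame of 2N windowed samples yields N coefficients and how overlapping consecutive frames cancels the aliasing.

```python
import math


def sine_window(half):
    """Sine window of length 2*half; satisfies w[n]^2 + w[n+half]^2 = 1."""
    size = 2 * half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def _basis(n, k, half):
    """MDCT cosine basis shared by the forward and inverse transforms."""
    return math.cos(math.pi / half * (n + 0.5 + half / 2.0) * (k + 0.5))


def mdct(block, window):
    """Map 2*half windowed time samples to half spectral coefficients."""
    half = len(block) // 2
    return [
        sum(block[n] * window[n] * _basis(n, k, half) for n in range(2 * half))
        for k in range(half)
    ]


def imdct(coeffs, window):
    """Map half coefficients back to 2*half windowed, time-aliased samples."""
    half = len(coeffs)
    return [
        (2.0 / half) * window[n]
        * sum(coeffs[k] * _basis(n, k, half) for k in range(half))
        for n in range(2 * half)
    ]


def round_trip(signal, half):
    """Transform overlapping frames (hop = half) and overlap-add the results."""
    window = sine_window(half)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * half + 1, half):
        block = signal[start:start + 2 * half]
        for i, value in enumerate(imdct(mdct(block, window), window)):
            out[start + i] += value
    return out


if __name__ == "__main__":
    half = 16  # small for speed; the text quotes frame lengths of 480 or 512
    signal = [math.sin(0.3 * n) for n in range(5 * half)]
    rebuilt = round_trip(signal, half)
    # Samples covered by two overlapping frames are recovered; the edges are not.
    worst = max(abs(a - b) for a, b in zip(signal[half:4 * half], rebuilt[half:4 * half]))
    print(f"max reconstruction error in the overlapped region: {worst:.2e}")
```

Only the samples covered by two overlapping frames are reconstructed, so the first and last half-frames of the signal remain aliased in this sketch. A production codec would use a fast transform rather than the direct sums shown here.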

The encoder works in stages: time-frequency transformation through the MDCT, quantization guided by the psychoacoustic model, Huffman entropy coding of the quantized values, and SBR/PS parameter extraction. For stereo content, AAC-ELD can operate either in true stereo mode or in parametric stereo mode, where spatial cues are encoded as compact parameters rather than as full channel information. The decoder performs the inverse operations: Huffman decoding, inverse quantization, inverse MDCT, and SBR synthesis to reconstruct the audio signal. This lets AAC-ELD deliver near-transparent audio quality at bitrates from 24 to 64 kbps while maintaining the low latency that conversational applications need.
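The frame lengths and bitrates quoted in this section imply a small, fixed bit budget per frame, which the quantization and entropy coding stages have to fit within. The sketch below works through that arithmetic. It assumes, for simplicity, that the core coder runs at the full 48 kHz rate and that the bitrate is constant; a configuration in which SBR lets the core run at a lower rate would change the frame duration. The function names are illustrative.

```python
def frame_duration_ms(frame_samples, sample_rate_hz):
    """Duration of one frame in milliseconds."""
    return 1000.0 * frame_samples / sample_rate_hz


def bits_per_frame(bitrate_bps, frame_samples, sample_rate_hz):
    """Average number of bits available per frame at a constant bitrate."""
    return bitrate_bps * frame_samples / sample_rate_hz


if __name__ == "__main__":
    sample_rate_hz = 48_000  # assumption: core coder at the full sampling rate
    for frame_samples in (480, 512):
        duration = frame_duration_ms(frame_samples, sample_rate_hz)
        for bitrate_bps in (24_000, 64_000):
            bits = bits_per_frame(bitrate_bps, frame_samples, sample_rate_hz)
            print(
                f"{frame_samples} samples ({duration:.2f} ms) at "
                f"{bitrate_bps // 1000} kbps: {bits:.1f} bits, {bits / 8:.1f} bytes"
            )
```

Under these assumptions a 480-sample frame lasts 10 ms and carries 240 bits at 24 kbps or 640 bits at 64 kbps, which shows how tightly the psychoacoustic model has to ration bits at the low end of the range.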

Within 3GPP networks, AAC-ELD supports real-time multimedia applications alongside speech codecs such as Enhanced Voice Services (EVS). It integrates with the IMS (IP Multimedia Subsystem) architecture through the codec negotiation procedures of SIP/SDP signaling, which let endpoints select an audio codec based on network conditions and device capabilities. Its error concealment and packet loss resilience suit wireless transmission, where packet loss and jitter are common, and help keep audio quality consistent under difficult network conditions.
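The codec negotiation described above can be sketched as a lookup over the `a=rtpmap` lines of an SDP offer. This is a simplified illustration rather than a full offer/answer implementation: it ignores `a=fmtp` parameters, bandwidth lines, and the offerer's own ordering, and it assumes AAC-ELD is advertised under the `mpeg4-generic` RTP encoding name. The sample offer, port, payload type numbers, and function names are all hypothetical.

```python
def parse_rtpmap(sdp_text):
    """Return {payload_type: encoding_name} for every a=rtpmap line."""
    codecs = {}
    for raw_line in sdp_text.splitlines():
        line = raw_line.strip()
        if not line.startswith("a=rtpmap:"):
            continue
        payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
        # Encoding looks like "name/clock-rate[/channels]"; keep only the name.
        codecs[int(payload_type)] = encoding.split("/")[0]
    return codecs


def select_codec(offer_sdp, local_preferences):
    """Pick the first locally preferred codec that the offer also lists.

    Returns (payload_type, encoding_name), or None if nothing matches.
    """
    offered = parse_rtpmap(offer_sdp)
    for wanted in local_preferences:
        for payload_type, name in offered.items():
            if name.lower() == wanted.lower():
                return payload_type, name
    return None


# Hypothetical offer: payload types 96 and 97 are arbitrary dynamic values.
EXAMPLE_OFFER = """\
m=audio 49152 RTP/AVP 96 97
a=rtpmap:96 mpeg4-generic/48000/1
a=rtpmap:97 AMR-WB/16000/1
"""

if __name__ == "__main__":
    print(select_codec(EXAMPLE_OFFER, ["mpeg4-generic", "AMR-WB"]))
    print(select_codec(EXAMPLE_OFFER, ["EVS", "AMR-WB"]))
```

An endpoint that prefers AAC-ELD but also supports a speech codec would fall back to the latter when the offer does not list the former. A real implementation would also have to compare the format parameters before accepting a match.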

Purpose & Motivation

AAC-ELD was developed to address the growing demand for high-quality, low-latency audio communication in mobile networks, particularly for services like Voice over LTE (VoLTE), video conferencing, and interactive multimedia applications. Prior to its introduction, mobile audio codecs faced a fundamental trade-off between compression efficiency, audio quality, and algorithmic delay. Traditional codecs like AMR-WB provided reasonable latency but limited audio bandwidth, while high-efficiency codecs like HE-AAC introduced unacceptable delays for conversational applications. This limitation became increasingly problematic as mobile networks evolved to support richer multimedia services that required both high fidelity and real-time interaction.

The creation of AAC-ELD was motivated by several specific limitations in existing approaches. Standard AAC codecs, while offering excellent compression efficiency, typically introduced algorithmic delays of 100-200 milliseconds due to their use of long transform windows and look-ahead buffers. This made them unsuitable for two-way communication where total end-to-end delay must remain below 150 milliseconds to maintain natural conversation flow. Meanwhile, dedicated low-delay codecs like G.722.1 offered minimal latency but suffered from inferior compression efficiency and audio quality compared to modern transform-based codecs.

3GPP recognized that emerging services like high-definition voice, video calling, and real-time gaming required a new audio coding solution that could bridge this gap. AAC-ELD specifically addresses these requirements by combining the transform coding efficiency of AAC with innovative low-delay optimizations and bandwidth extension techniques. Its development was driven by the evolution of mobile networks toward all-IP architectures where voice and multimedia services converge, creating a need for a unified audio codec that could deliver studio-quality audio with conversational latency while efficiently utilizing available network bandwidth.

Key Features

  • Algorithmic delay of 15-32 milliseconds enabling natural conversation
  • Spectral Band Replication (SBR) for efficient high-frequency coding
  • Parametric Stereo (PS) support for spatial audio at low bitrates
  • Bitrate range from 24 to 64 kbps with scalable quality
  • Robust error concealment for packet loss resilience
  • Built on the AAC-LD core transform coder

Evolution Across Releases

Rel-13 Initial

Initial standardization of AAC-ELD in 3GPP TS 26.923, specifying the codec architecture, including the hybrid AAC-LD+SBR framework, parametric stereo capabilities, and detailed algorithmic descriptions. The release defined the encoder and decoder operations, psychoacoustic model parameters, and integration requirements for mobile communication systems, establishing AAC-ELD as a codec for real-time multimedia applications alongside Enhanced Voice Services (EVS).

Defining Specifications

Specification  Title
TS 26.923      3GPP TS 26.923