Description
Next Generation Audio (NGA) is a framework within the 3GPP standards that defines advanced audio coding and processing technologies for multimedia telecommunication services. Its cornerstone is the Enhanced Voice Services (EVS) codec, which supports audio bandwidths from narrowband up to super-wideband and full-band and provides markedly better voice quality and robustness than earlier codecs such as AMR-WB. NGA is designed to deliver high-quality audio across a wide range of bitrates and network conditions, from robust low-bitrate narrowband operation where coverage is limited to high-bitrate full-band operation for music.
NGA relies on audio compression that adapts to the content being coded. The EVS codec, for instance, uses a hybrid approach that combines Code-Excited Linear Prediction (CELP) for efficient speech coding with Modified Discrete Cosine Transform (MDCT) coding for music and general audio. The encoder classifies the input signal as speech or general audio and switches between the two coding modes accordingly. The framework also includes capabilities for immersive audio, such as support for multi-channel audio (e.g., 5.1, 7.1) and object-based audio scenes, enabling more realistic soundscapes in services like teleconferencing or augmented reality.
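The switching between coding modes can be illustrated with a toy sketch. This is not the EVS classifier or its switching logic: the `ModeSelector` class, the `hold_frames` hold-off counter, and the per-frame boolean speech decision are all hypothetical, chosen only to show how a per-frame classification might drive a CELP/MDCT choice without flipping modes on a single outlier frame.

```python
from enum import Enum


class CodingMode(Enum):
    CELP = "CELP"  # linear-prediction path, suited to speech
    MDCT = "MDCT"  # transform path, suited to music and general audio


class ModeSelector:
    """Toy mode selector with a hold-off counter (illustrative assumption)."""

    def __init__(self, hold_frames=3, initial=CodingMode.CELP):
        self.hold_frames = hold_frames  # frames that must disagree before switching
        self.mode = initial
        self._pending = 0               # consecutive frames asking for the other mode

    def update(self, is_speech):
        """Feed one per-frame classifier decision and return the mode to use."""
        wanted = CodingMode.CELP if is_speech else CodingMode.MDCT
        if wanted == self.mode:
            self._pending = 0
        else:
            self._pending += 1
            if self._pending >= self.hold_frames:
                self.mode = wanted
                self._pending = 0
        return self.mode


def select_modes(decisions, hold_frames=3):
    """Map a sequence of speech/non-speech decisions to mode names."""
    selector = ModeSelector(hold_frames=hold_frames)
    return [selector.update(d).value for d in decisions]


if __name__ == "__main__":
    # One isolated non-speech frame is ignored; three in a row trigger a switch.
    print(select_modes([True, True, False, True, False, False, False, True]))
```

A real encoder would derive the speech/audio decision from signal features and would also have to manage the transition between the two coding paths, which this sketch leaves out.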
Architecturally, NGA technologies are integrated into the media processing functions of the network, such as the Media Resource Function (MRF) in the IMS core, and are implemented in end-user devices. Key components include the codec itself, jitter buffer management to absorb variations in packet arrival time, packet loss concealment, and error resilience mechanisms. Its role is to provide a future-proof audio foundation for services ranging from VoLTE and VoNR voice calls to streaming media, video conferencing, and broadcast services, supporting a consistent quality of experience (QoE) over packet-switched networks.
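How a receiver might combine reordering with loss handling can be sketched as follows. This is a deliberately simplified illustration, not the standardized jitter buffer or concealment algorithm: the function name `reorder_and_conceal`, the `(sequence number, payload)` packet shape, and the "repeat the last good frame" concealment are assumptions made for the example. Real jitter buffers adapt their depth over time, and real concealment runs inside the decoder.

```python
def reorder_and_conceal(packets, first_seq, frame_count):
    """Return frames in playout order as (seq, payload, concealed) tuples.

    packets: iterable of (seq, payload) in arrival order, possibly out of
             order, duplicated, or with gaps.
    A missing frame is filled with the last good payload and flagged as
    concealed; if nothing has been received yet, its payload is None.
    """
    received = {}
    for seq, payload in packets:
        received.setdefault(seq, payload)  # keep first copy, drop duplicates

    playout = []
    last_good = None
    for seq in range(first_seq, first_seq + frame_count):
        if seq in received:
            last_good = received[seq]
            playout.append((seq, last_good, False))
        else:
            playout.append((seq, last_good, True))  # lost frame: conceal
    return playout


if __name__ == "__main__":
    arrivals = [(2, "b"), (1, "a"), (4, "d"), (2, "b-dup")]
    for frame in reorder_and_conceal(arrivals, first_seq=1, frame_count=4):
        print(frame)
```

The separation matters in practice: the buffer decides when a frame is played out or declared lost, while concealment decides what the listener hears in its place.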
Purpose & Motivation
NGA was created to address the growing demand for high-quality, immersive audio experiences over mobile networks, which legacy codecs such as AMR-NB and AMR-WB did not fully meet. The primary motivation was the evolution of voice services from traditional telephony to HD Voice (VoLTE) and beyond, which requires a codec that keeps speech clear and intelligible even under challenging network conditions. Furthermore, the rise of music streaming, video calling, and multimedia services demanded a single, efficient codec capable of handling both speech and general audio with high fidelity.
NGA addresses the problem of fragmented audio experiences by providing a unified, high-performance audio solution. Previous approaches required different codecs for speech and music, which added complexity and risked quality degradation during transitions. NGA, through EVS, offers a single codec that handles both, with improved noise robustness and packet loss concealment. Its development was also driven by the need for bandwidth efficiency, allowing operators to raise quality without a proportional increase in network load, and by the emerging requirements for immersive and interactive audio in 5G-era services such as augmented reality communication.
Key Features
- Enhanced Voice Services (EVS) codec for super-wideband and full-band audio
- Hybrid coding (CELP/MDCT) for optimal speech and music performance
- Advanced packet loss concealment and error resilience mechanisms
- Support for immersive audio formats and multi-channel configurations
- Wide operational bitrate range from 5.9 kbps to 128 kbps
- Backward compatibility with legacy AMR-WB for interoperability
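To give a sense of what the bitrate range in the list above means per packet, the following sketch converts a constant bitrate into a per-frame payload size. It assumes a 20 ms frame duration, which is the frame length EVS uses but is not stated in this entry, and it ignores RTP/UDP/IP and payload-format overhead. The helper names `frame_bits` and `frame_bytes` are hypothetical. The 5.9 kbps mode is left out because it is a variable-bitrate mode whose figure is an average, so it has no single frame size.

```python
FRAME_MS = 20  # assumed frame duration in milliseconds


def frame_bits(bitrate_bps, frame_ms=FRAME_MS):
    """Bits of codec payload in one frame at a constant bitrate."""
    bits, remainder = divmod(bitrate_bps * frame_ms, 1000)
    if remainder:
        raise ValueError("bitrate does not give a whole number of bits per frame")
    return bits


def frame_bytes(bitrate_bps, frame_ms=FRAME_MS):
    """Payload size in whole bytes, rounded up."""
    return -(-frame_bits(bitrate_bps, frame_ms) // 8)


if __name__ == "__main__":
    for bps in (13_200, 24_400, 128_000):
        print(f"{bps / 1000:g} kbps: {frame_bits(bps)} bits, {frame_bytes(bps)} bytes per frame")
```

At these rates the codec payload per 20 ms frame grows from a few dozen bytes to a few hundred, which is why packet header overhead weighs proportionally more at the low end of the range.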
Evolution Across Releases
Introduced the Next Generation Audio framework, primarily centered on the EVS codec. Established the core specifications for high-quality speech and audio services, including the codec algorithm, test sequences, and integration guidelines for VoLTE and other multimedia services.
Defining Specifications
| Specification | Title |
|---|---|
| TS 26.917 | 3GPP TS 26.917 |
| TS 26.949 | 3GPP TS 26.949 |