HOA

Higher Order Ambisonics

Introduced in Rel-14

HOA is a spatial audio format specified in 3GPP media standards for capturing and reproducing a full-sphere (360-degree) sound field, enabling immersive, scene-based audio experiences over mobile networks.

Category: Services
Introduced: Rel-14
Where: Services › Codecs
Specifications: 8 specs

Description

Higher Order Ambisonics (HOA) is a scene-based representation of a sound field, designed to capture and reproduce immersive, three-dimensional audio. Unlike channel-based formats (e.g., 5.1 or 7.1 surround), which assign audio to specific speaker locations, Ambisonics represents the sound field as a set of spherical harmonic coefficients. These coefficients mathematically describe how sound pressure varies across all directions around a point in space. The 'order' (e.g., 1st, 2nd, 3rd) determines the spatial resolution: higher orders carry more precise directional information, so the position, width, and elevation of sound sources are reproduced more accurately.

Technically, HOA audio is created either by encoding signals from a microphone array or by synthesizing the sound field from audio objects, producing a set of Ambisonics channels (B-format, in first-order terminology). Each channel corresponds to a specific spherical harmonic component (e.g., W for the omnidirectional component, X/Y/Z for the first-order figure-of-eight patterns). Nth-order Ambisonics uses (N+1)² channels: 4 for first order, 9 for second order, and 16 for third order. The encoded signal is a scene-based representation, meaning it is independent of any specific playback system. For rendering, the HOA stream is decoded either to the target speaker layout or binaurally for headphones, using a set of decoding coefficients that project the spherical harmonics onto the available output transducers.
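As a rough illustration of the encode and decode steps described above, the following Python sketch computes the channel count for a given order, encodes a single mono sample into the four first-order components, and decodes it to one loudspeaker. It is a minimal sketch, not a normative procedure: the function names are hypothetical, the gains assume SN3D normalization (W carried at unit gain), azimuth is assumed to be measured counter-clockwise from the front, and the decoder is a simple projection (virtual cardioid microphone) chosen for brevity rather than quality.

```python
import math


def hoa_channel_count(order: int) -> int:
    """Number of channels for a full-sphere Ambisonics signal: (N+1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def encode_first_order(sample: float, azimuth_deg: float, elevation_deg: float):
    """Encode one mono sample arriving from a direction into (W, X, Y, Z).

    Assumption: SN3D-style gains, azimuth counter-clockwise from front,
    elevation positive upward.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)  # left-right figure-of-eight
    z = sample * math.sin(el)                 # up-down figure-of-eight
    return w, x, y, z


def decode_to_speaker(wxyz, speaker_azimuth_deg: float,
                      speaker_elevation_deg: float = 0.0) -> float:
    """Projection decode: a virtual cardioid pointed at the speaker position."""
    w, x, y, z = wxyz
    az = math.radians(speaker_azimuth_deg)
    el = math.radians(speaker_elevation_deg)
    directional = (x * math.cos(az) * math.cos(el)
                   + y * math.sin(az) * math.cos(el)
                   + z * math.sin(el))
    return 0.5 * (w + directional)


if __name__ == "__main__":
    for order in (1, 2, 3):
        print(f"order {order}: {hoa_channel_count(order)} channels")

    # A source directly to the listener's left (azimuth 90 degrees).
    scene = encode_first_order(1.0, azimuth_deg=90.0, elevation_deg=0.0)
    print("left speaker gain :", round(decode_to_speaker(scene, 90.0), 6))
    print("right speaker gain:", round(decode_to_speaker(scene, -90.0), 6))
```

Because the same four-channel scene feeds `decode_to_speaker` for any speaker position, the sketch also shows why the encoded signal is independent of the playback layout: only the decoding step changes.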

Within the 3GPP ecosystem, HOA is integrated into media delivery standards, particularly for Virtual Reality (VR), Augmented Reality (AR), and immersive teleconferencing. Key specifications define the transport and storage of HOA content, such as its encapsulation in the ISO Base Media File Format (ISOBMFF) for Dynamic Adaptive Streaming over HTTP (DASH). Codecs like MPEG-H 3D Audio support HOA as an input format for compression and transmission. The network's role is to deliver these potentially high-bitrate, multi-channel audio streams efficiently, often in synchronization with 360-degree video, requiring robust QoS and media-aware network functions.
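To give a sense of why these streams are described as potentially high-bitrate, the sketch below estimates the uncompressed PCM bitrate of an HOA signal from its order. The figures are illustrative only: the 48 kHz sample rate and 16-bit sample size are assumptions rather than values taken from the specifications, the function name is hypothetical, and a codec such as MPEG-H 3D Audio would reduce the transmitted rate well below these raw numbers.

```python
def raw_pcm_bitrate_bps(order: int,
                        sample_rate_hz: int = 48_000,
                        bits_per_sample: int = 16) -> int:
    """Uncompressed bitrate in bit/s for an HOA signal of the given order.

    Assumption: every one of the (N+1)^2 channels is carried as plain PCM.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    channels = (order + 1) ** 2
    return channels * sample_rate_hz * bits_per_sample


if __name__ == "__main__":
    for order in (1, 2, 3):
        mbps = raw_pcm_bitrate_bps(order) / 1_000_000
        print(f"order {order}: {mbps:.3f} Mbit/s uncompressed")
```

Under these assumptions a third-order stream needs 16 × 48,000 × 16 = 12,288,000 bit/s, roughly 12.3 Mbit/s before compression, compared with about 3.1 Mbit/s for first order.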

Purpose & Motivation

HOA was standardized in 3GPP to address the limitations of traditional audio formats for emerging immersive media applications. Channel-based surround sound is tied to a fixed, predefined speaker configuration and does not adequately support 360-degree listener rotation, which is essential for VR/AR. First-order Ambisonics (FOA) offers basic 3D audio but with limited spatial resolution and accuracy, often resulting in blurred or imprecise sound source localization. HOA was introduced to solve these problems, providing the high-fidelity, full-sphere audio necessary for convincing presence and realism in virtual environments.

The driving motivation was the commercial rise of VR and 360-degree video services, which demanded an audio format that could match the visual immersion. HOA enables sound to remain stable and accurately positioned relative to the visual scene as the user rotates their head, which is critical for maintaining the illusion of being 'inside' the content. From a network and service perspective, standardizing HOA ensures interoperability between content creation tools, compression codecs, streaming servers, and playback devices, preventing vendor lock-in and fostering a healthy ecosystem for immersive media.

Furthermore, HOA's scene-based nature is more efficient for interactive and adaptive streaming compared to transmitting multiple discrete object tracks. A single HOA stream can represent a complex auditory scene, and the rendering can be adapted client-side based on user interaction (head movement) or device capabilities (different headphone types), without requiring the server to re-encode or send multiple audio streams. This reduces server complexity and network load, making it a scalable solution for delivering personalized immersive audio over mobile networks.
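The client-side adaptation to head movement described above can be sketched for the first-order case, where compensating for head yaw is a 2×2 rotation of the X and Y components while W and Z stay unchanged. This is a minimal sketch under stated assumptions: the function name is hypothetical, yaw is taken as positive when the head turns to the left (counter-clockwise seen from above), and higher orders or pitch and roll would need larger rotation matrices that are not shown here.

```python
import math


def compensate_head_yaw(wxyz, head_yaw_deg: float):
    """Rotate a first-order scene (W, X, Y, Z) to counter a head yaw.

    Assumption: positive yaw means the head turns left, so the scene is
    rotated by the opposite angle to stay fixed relative to the world.
    """
    w, x, y, z = wxyz
    phi = math.radians(-head_yaw_deg)  # rotate the scene against the head
    x_rot = x * math.cos(phi) - y * math.sin(phi)
    y_rot = x * math.sin(phi) + y * math.cos(phi)
    return w, x_rot, y_rot, z          # W and Z are unaffected by yaw


if __name__ == "__main__":
    # A source straight ahead: W = 1, X = 1, Y = 0, Z = 0 (SN3D-style gains).
    front_source = (1.0, 1.0, 0.0, 0.0)
    # The listener turns 90 degrees to the left; the source should now be
    # heard from the right, i.e. with a negative Y component.
    rotated = compensate_head_yaw(front_source, head_yaw_deg=90.0)
    print([round(v, 6) for v in rotated])
```

The rotation touches only the already-received channels, which is why no new stream has to be requested from the server when the listener turns.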

Classification

Part of: DASH
Specific types: HOA2SN3D

Evolution Across Releases

Rel-14 Initial

Initially standardized for immersive media services. Specifications defined the core HOA format, its encapsulation for streaming, and support for first-order and higher-order content within VR and audio-on-demand applications, establishing the foundation for 3D audio delivery.

Defining Specifications

3GPP specifications that define or reference HOA, with the latest known release, sourced from the 3GPP document catalog.

Specification     Title                                                Release
TS 26.118 vj00    Virtual Reality Media Formats                        Rel-19
TS 26.253 vj00    IVAS Codec Algorithmic Description                   Rel-19
TR 26.805 vh01    Study on Media Production over 5G NPN Systems        Rel-17
TS 26.818 vf00    Audio Media Profiles Test Results for VR Streaming   Rel-15
TR 26.865 vi00    Technical Report                                     Rel-18
TR 26.918 vj00    Virtual Reality Relevance Study for 3GPP             Rel-19
TR 26.933 vj00    Study on Diverse Audio Capturing System              Rel-19
TR 26.998 vj00    5G AR/MR Glasses Integration Study                   Rel-19