Description
Scene-Based Audio (SBA), specifically based on the Ambisonics technique, is a full-sphere surround sound format that captures and represents a three-dimensional sound field. Unlike channel-based audio (e.g., 5.1, 7.1.4), which encodes audio for specific speaker positions, or object-based audio (e.g., the object mode of MPEG-H 3D Audio), which encodes individual sound objects with metadata, SBA encodes the sound field itself as a set of spherical harmonic components. This mathematical representation describes the pressure and particle velocity of the sound field at a point in space, allowing the original sound field to be reconstructed over a variety of playback systems, from headphones with binaural rendering to complex speaker arrays.
The core of SBA is the B-format signal, which in its first-order form consists of four channels: W (omnidirectional pressure), and X, Y, Z (three orthogonal figure-of-eight components representing pressure gradients). This first-order Ambisonics (FOA) can be extended to higher-order Ambisonics (HOA) by including more spherical harmonic components: a full-sphere representation of order N uses (N+1)^2 channels, so second order needs 9 and third order needs 16. Higher orders increase the spatial resolution and accuracy of the reconstructed sound field, giving more precise localization. The 3GPP standardization focuses on efficiently compressing, transporting, and rendering these Ambisonics components within media services, such as streaming for virtual reality (VR), augmented reality (AR), and 360-degree video.
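As a rough illustration of the first-order encoding described above, the sketch below pans a single mono sample into the four B-format channels and computes the channel count for a given Ambisonics order. The function names (`encode_foa`, `hoa_channel_count`) are hypothetical, and the traditional B-format convention is assumed (W attenuated by 1/√2, azimuth measured counterclockwise from the front). Other conventions, such as ACN channel ordering with SN3D normalization, differ in channel order and scaling. Nothing here is taken from a 3GPP specification.

```python
import math
from typing import Tuple


def hoa_channel_count(order: int) -> int:
    """Channels needed for a full-sphere Ambisonics signal of the given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def encode_foa(sample: float, azimuth_deg: float,
               elevation_deg: float) -> Tuple[float, float, float, float]:
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention (illustrative only): traditional B-format with
    W scaled by 1/sqrt(2); azimuth is counterclockwise from the front,
    elevation is upward from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional pressure
    x = sample * math.cos(az) * math.cos(el)  # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)  # left-right figure-of-eight
    z = sample * math.sin(el)                 # up-down figure-of-eight
    return w, x, y, z


if __name__ == "__main__":
    print(hoa_channel_count(1), hoa_channel_count(3))
    print(encode_foa(1.0, 90.0, 0.0))  # source directly to the left
```

A source straight ahead lands entirely in W and X, while a source to the left lands in W and Y, which is what makes the representation independent of any particular speaker layout.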
Within the 3GPP architecture, SBA is integrated into the media delivery pipeline. The specifications define how SBA content is encapsulated in media containers (like ISOBMFF), compressed using audio codecs (with specific handling for the spherical harmonic channels), and described in media presentation descriptions. A key aspect is the support for dynamic rendering: the SBA bitstream, containing the sound field coefficients, is delivered to the client device. The device's audio renderer then uses a set of decoding matrices, potentially tailored to the user's head orientation (tracked via head-mounted displays) and output setup (headphones or speakers), to binauralize or decode the audio for immersive playback. Head-tracked rendering of this kind provides three degrees of freedom (3DoF): the sound scene stays stable as the listener rotates their head. Letting the listener also move within the sound scene (six degrees of freedom, 6DoF) requires additional processing, because an Ambisonics signal describes the sound field around a single point.
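The head-tracked rendering step can be sketched for the first-order case as follows. This is a minimal illustration under stated assumptions, not the renderer defined by any 3GPP codec: it assumes traditional B-format (W scaled by 1/√2), handles yaw only, and decodes with a simple virtual cardioid microphone instead of a binaural or matrix decoder. The names `compensate_head_yaw` and `virtual_cardioid` are hypothetical.

```python
import math
from typing import Tuple

BFormat = Tuple[float, float, float, float]  # (W, X, Y, Z)


def compensate_head_yaw(frame: BFormat, head_yaw_deg: float) -> BFormat:
    """Rotate the sound field opposite to the listener's head yaw.

    A head turn of +yaw (counterclockwise) is compensated by rotating the
    scene by -yaw about the vertical axis. W and Z are unaffected by a
    rotation about that axis.
    """
    w, x, y, z = frame
    a = math.radians(-head_yaw_deg)
    c, s = math.cos(a), math.sin(a)
    return (w, x * c - y * s, x * s + y * c, z)


def virtual_cardioid(frame: BFormat, look_azimuth_deg: float) -> float:
    """Decode one horizontal output by pointing a virtual cardioid microphone.

    Illustrative decoder only: gain is 1 toward the look direction and
    0 directly behind it, for the assumed W = S / sqrt(2) convention.
    """
    w, x, y, _ = frame
    az = math.radians(look_azimuth_deg)
    return 0.5 * (math.sqrt(2.0) * w + x * math.cos(az) + y * math.sin(az))


if __name__ == "__main__":
    # Unit-amplitude source directly to the listener's left (azimuth 90 degrees).
    source_left: BFormat = (1.0 / math.sqrt(2.0), 0.0, 1.0, 0.0)
    # The listener turns 90 degrees to the left, so the source is now in front.
    rotated = compensate_head_yaw(source_left, 90.0)
    print(virtual_cardioid(rotated, 0.0))    # pickup toward the front
    print(virtual_cardioid(rotated, 180.0))  # pickup toward the rear
```

Because the rotation acts on the transmitted coefficients and not on rendered speaker feeds, the same bitstream can serve any head orientation and output setup. A practical renderer would extend this to pitch and roll and to higher orders.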
3GPP's work on SBA spans multiple technical specifications (TS) covering codecs, file formats, system protocols, and security, with the aim of ensuring interoperability for immersive audio services across different networks and devices. The specifications also address metadata for coordinating SBA with 360-degree video, so that audio and video stay synchronized as the user's viewpoint changes. This makes SBA a foundational technology for delivering interactive media experiences over 5G and later networks.
Purpose & Motivation
Scene-Based Audio (Ambisonics) was standardized by 3GPP to address the growing market for immersive media, particularly driven by virtual and augmented reality. Traditional channel-based audio is tied to fixed speaker configurations and cannot adapt to user head movement or different playback environments. Object-based audio provides flexibility but requires significant metadata and computational power for rendering many objects. SBA was motivated by the need for a format that inherently describes a complete sound scene in a compact, playback-agnostic manner.
The historical context is the rise of 360-degree video and VR content. Early VR experiences often used basic binaural audio or simple multi-channel mixes, which broke immersion when the user turned their head. Ambisonics, a decades-old academic concept, was identified as a suitable solution because it encodes the sound field mathematically. 3GPP's role was to standardize its use in a telecommunications ecosystem, solving the problems of efficient compression for transmission over bandwidth-constrained mobile networks and defining how clients receive and render the audio in sync with video.
SBA addresses key limitations of previous audio formats for immersive applications. Channel-based audio lacks adaptability. Object-based audio can become computationally complex for dense scenes. SBA sits between the two: a scene description that is relatively compact, independent of the output setup, and well suited to head-tracked binaural rendering, which is essential for VR. Its standardization enables content creators to produce a single audio stream that works on any compliant device, from mobile phones with headphones to dedicated VR systems, fostering an interoperable ecosystem for immersive 3GPP media services.
Detected Changes Across Releases
From 3GPP Change Requests
Specific changes extracted from the "Change history" tables of 3GPP specifications (105 CRs across 6 releases). Complements the general historical overview above with the evidence-based evolution of this function.
Studied in Rel-14, normative work from Rel-15.
In Release 15, the changes matched under the SBA acronym fall into two groups. For Scene-Based Audio (Ambisonics), the findings and conclusions of the study on 3GPP codecs for VR audio were recorded, and corrections were made to the sensitivity calculation for immersive audio playback to improve the quality assessment. The scope clarification and the priority indication over SBA interfaces via a Message Priority header, by contrast, were made in TS 23.501, the 5G System architecture specification, where SBA denotes the Service-Based Architecture.
- Findings and Conclusions from study on 3GPP codecs for VR audio (TR 26.918, CR0003)
- Clarification for S-NSSAI based congestion Control (TS 23.501, CR0072)
- SBA Scope Clarification (TS 23.501, CR0157)
- SUPI based paging (TS 23.501, CR0199)
- Clarification of S-NSSAI based congestion control (TS 23.501, CR0295)
- Corrections to AF influence (5.6.7) based on CT WG3 LS on AF influence on traffic routing (TS 23.501, CR0558)
In Release 16, the enhancements for Scene-Based Audio (Ambisonics) were delivered through the new SEAL Data Delivery (SEALDD) enabler, which introduced a service-based architecture for multi-modal data transmission (the 5G Core sense of the acronym SBA, not Scene-Based Audio). This provided SEALDD-enabled multi-modal flow synchronization and optimization for immersive services such as XR. The architecture allowed the SEALDD server to act as an Application Function (AF) and consume 5G Core network services over service-based interfaces for enhanced media delivery.
- Introduction of NEF based infrequent small data transfer via NAS (TS 23.501, CR0890)
- CR for TS 23.501 based on conclusion of eNA TR 23.791 (TS 23.501, CR0831)
- NRF based P-CSCF discovery (TS 23.501, CR1035)
- QoS monitoring based on GTP-U paths (TS 23.501, CR1414)
- Addition of General SBA/SBI aspects in TS 33.117 (TS 33.117, CR0047)
- Failure handling for redundancy based on dual connectivity (TS 23.501, CR1489)
In Release 17, Scene-Based Audio (Ambisonics) was addressed as part of the SEAL Data Delivery (SEALDD) enabler for multi-modal services. This function supports SEALDD-enabled multi-modal data transmission, which can coordinate separate audio and video application traffic flows. The architecture for this is based on the service-based representation specified for the SEALDD layer and uses the 5G Core's Service-Based Architecture, which shares the acronym SBA.
- Network Slice restriction based on NWDAF analytics (TS 23.501, CR2567)
- NWDAF discovery and selection based on provided ML models (TS 23.501, CR2585)
- UP path selection enhancement based on analytics info provided by NWDAF (TS 23.501, CR2586)
- Clarification on UE provides PDU Session Pair ID based on URSP rules (TS 23.501, CR2736)
- Thresholds for Priority-based mode (TS 23.501, CR2744)
- IMSI based SUPI support when access an SNPN using credentials owned by CH (TS 23.501, CR2919)
In Release 18, the enhancements for Scene-Based Audio (SBA) within the SEALDD framework primarily focused on enabling more sophisticated multi-modal flow synchronization and data delivery optimization. This included the introduction of SEALDD-enabled multi-modal data transmission services and the establishment of SEALDD regular data transmission connections based on provisioned Data Delivery (DD) policy. These improvements used the 5G Core's service-based architecture for network exposure, allowing the SEALDD server to act as an Application Function (AF) and consume 5G Core network services.
- Adding time synchronization service based on subscription (TS 23.501, CR3762)
- PCF support of 5GS Packet Delay Variation monitoring based on QoS monitoring mechanism and exposed to AF (TS 23.501, CR3792)
- Support of PDU Set based handling (TS 23.501, CR4046)
- CN based MT communication capability indication (TS 23.501, CR4081)
- PDU Set based QoS Handling for uplink transmission (TS 23.501, CR4744)
- Objective Test Methodologies for IVAS-based UEs (TS 26.260, CR0006)
In Release 19, the enhancements for Scene-Based Audio (SBA) within the SEALDD framework introduced a specific application enablement architecture for tethered XR devices based on the PINAPP model. This architecture supports SEALDD-enabled multi-modal data transmission services, which include synchronized audio flows as part of the coordinated application traffic. Furthermore, the release defined procedures for SEALDD-enabled regular data transmission connection establishment based on provisioned Data Delivery (DD) policy to optimize the delivery of such multi-modal application traffic.
- Architecture update to support the tethered UE based on PINAPP (TS 23.433, CR0087)
- XR architecture based on SEALDD architecture (TS 23.433, CR0108)
- Subscription-based routing to a target core network (TS 23.501, CR5380)
- I-SMF selection/insertion based on local offloading allowed indication (TS 23.501, CR5604)
- Support PDU Set information identification based on MoQ for encrypted XRM traffic (TS 23.501, CR5632)
- Support of Slice change based on AF request (TS 23.501, CR5764)
In Release 20, the changes concern SBA in the sense of the Service-Based Architecture, not Scene-Based Audio. The specification was enhanced to explicitly support SEAL Data Delivery (SEALDD) functions acting as an Application Function (AF). This is detailed through new service-based representations in which the SEALDD server consumes 5G Core Network services over SBA interfaces and uses Core Network northbound APIs via CAPIF. These architectural updates formally integrate SEALDD's multi-modal and XR application optimization capabilities within the 5GS service-based framework.
- Mitigation actions based on New Abnormal user plane traffic Analytics (TS 23.501, CR6507)
- Rel-20 CR TS 28.541 Enhancement on PreDefinedPccRule for exposure of information related to XRM service based on QoSMonitoring (TS 28.541, CR1570)
- Rel-20 CR TS 28.541 Enhancement on PreDefinedPccRule for PDU set based handling (TS 28.541, CR1571)
- Fix the mistake of PDU Set based handling and PDU Set based QoS handling (TS 23.501, CR6526)
Defining Specifications
3GPP specifications that define or reference SBA, with the latest known release. Sourced from the 3GPP document catalog.
| Specification | Title | Release |
|---|---|---|
| TS 23.433 vk00 | SEAL Data Delivery (SEALDD) for Verticals | Rel-20 |
| TS 23.501 vk00 | 5G System Architecture Stage 2 | Rel-20 |
| TS 23.540 vj20 | 5G Service Based SMS Stage 2 | Rel-19 |
| TS 23.700 vk00 | XR Services Application Enablement Layer | Rel-20 |
| TS 24.229 vj50 | IMS call control protocol based on SIP and SDP | Rel-19 |
| TS 26.253 vj00 | IVAS Codec Algorithmic Description | Rel-19 |
| TS 26.255 vj00 | IVAS Frame Loss Concealment Procedure | Rel-19 |
| TS 26.258 vj10 | IVAS Codec Floating-Point C Code Specification | Rel-19 |
| TS 26.260 vj00 | Immersive Audio Objective Test Methods | Rel-19 |
| TS 26.261 vj00 | Electro-acoustic specs for immersive terminals | Rel-19 |
| TS 26.501 vj30 | 5G Media Streaming (5GMS) Architecture | Rel-19 |
| TR 26.918 vj00 | Virtual Reality Relevance Study for 3GPP | Rel-19 |
| TR 26.997 vj00 | IVAS Codec Specification | Rel-19 |
| TS 28.541 vk00 | 5G Network Resource Model (NRM) Stage 2/3 | Rel-20 |
| TS 29.309 vj10 | Nbsp Service Based Interface for GBA BSF | Rel-19 |
| TR 29.829 vh10 | SMS Service-Based Interfaces for 5G Core | Rel-17 |
| TS 33.117 vk00 | Catalogue of General Security Assurance Requirements | Rel-20 |
| TS 33.514 vk00 | 5G Security Assurance for UDM | Rel-20 |
| TR 33.794 vj10 | Study on Zero Trust Security Enablers for 5G | Rel-19 |
| TR 33.835 vg10 | Study on authentication and key management for apps | Rel-16 |
| TR 33.841 vg10 | Security aspects; Study on 256-bit algorithms for 5G | Rel-16 |
| TR 33.848 vi00 | Technical Report on Virtualisation Security | Rel-18 |