Description
MASA1, or mono-MASA, is a profile of the Metadata-Assisted Spatial Audio (MASA) format in which the audio essence is carried as a single mono transport channel (TC). That channel contains the sum of all audio elements in the scene. All spatial information is carried in the separate MASA metadata stream, which describes the scene parametrically (for example, the direction a sound arrives from and how directional or diffuse it is) rather than as additional audio channels. During decoding, the receiver's renderer applies these spatial parameters to the mono signal to synthesize a spatial sound image. It does so with signal-processing techniques such as filtering and amplitude panning, driven by the metadata, to 'place' sounds around the listener. The key components are the same as in the base MASA framework, but they operate on a single audio channel. MASA1 is the most bitrate-efficient way to deliver basic spatial audio scenes within MASA, which suits applications where the scene is simple or bandwidth is at a premium, while still improving on plain mono audio by adding a spatial dimension.
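The rendering step described above can be illustrated with a minimal sketch. It is not the renderer defined for MASA; it only shows the general idea of metadata-driven amplitude panning of a mono signal to two loudspeakers. The function name `pan_mono_frame`, the azimuth sign convention, and the use of a single broadband direction and direct-to-total energy ratio per frame are all assumptions made for illustration. A real renderer would work per time-frequency band and would decorrelate the diffuse part.

```python
import math


def pan_mono_frame(samples, azimuth_deg, direct_to_total):
    """Render one mono frame to stereo using two illustrative parameters.

    azimuth_deg:     assumed convention, +90 is hard left, -90 is hard right
    direct_to_total: assumed share of energy treated as directional, 0..1
    """
    if not 0.0 <= direct_to_total <= 1.0:
        raise ValueError("direct_to_total must lie in [0, 1]")

    # Clamp to the frontal arc that a two-speaker layout can reproduce.
    az = max(-90.0, min(90.0, azimuth_deg))

    # Constant-power pan law: theta = 0 is hard left, pi/2 is hard right.
    theta = (90.0 - az) / 180.0 * (math.pi / 2.0)

    direct = math.sqrt(direct_to_total)
    # The non-directional remainder is split equally between both channels.
    # A real renderer would decorrelate this part; the sketch does not.
    diffuse = math.sqrt(1.0 - direct_to_total) / math.sqrt(2.0)

    gain_left = direct * math.cos(theta) + diffuse
    gain_right = direct * math.sin(theta) + diffuse

    left = [gain_left * s for s in samples]
    right = [gain_right * s for s in samples]
    return left, right


# Example: a short mono frame placed 45 degrees to the left, mostly directional.
left, right = pan_mono_frame([0.5, -0.25, 0.1], azimuth_deg=45.0, direct_to_total=0.8)
```

The same mono samples feed both outputs; only the gains derived from the metadata differ, which is why no additional audio channels need to be transmitted.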
Purpose & Motivation
MASA1 was defined as the baseline, minimal-complexity profile of the MASA ecosystem. It addresses the need to add spatial immersion to services where the source audio is originally mono (e.g., a podcast, a voice call, or a legacy audio asset) or where ultra-low bitrate is the primary constraint. Before MASA, upgrading a mono service to spatial audio required re-encoding the content into a multi-channel or object-based format, which significantly increased the bitrate. MASA1 avoids this by pairing the existing mono audio with lightweight metadata, enabling a spatial experience without the bandwidth cost of transmitting multiple discrete audio channels. This gives service providers a clear migration path for adding immersive features to existing mono-based services.
Key Features
- Core audio consists of a single mono transport channel (TC)
- Full spatial scene description delegated to the metadata track
- Lowest bitrate requirement among MASA profiles
- Enables spatialization of legacy mono content
- Supports HRTF-based (binaural) rendering for headphones and panning-based rendering for loudspeakers
- Serves as the foundation for more complex MASA profiles
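As a rough illustration of the first two features and of spatializing legacy mono content, the sketch below pairs a mono sample stream with a small metadata record per frame. The class names, field names, default values, and frame layout are hypothetical and chosen only for this example; they do not reproduce the metadata syntax of the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpatialParams:
    """Hypothetical per-frame spatial description (not the real MASA syntax)."""
    azimuth_deg: float = 0.0       # assumed default: straight ahead
    elevation_deg: float = 0.0
    direct_to_total: float = 1.0   # assumed default: fully directional


@dataclass
class MonoSpatialFrame:
    transport: List[float]         # the single mono transport channel
    params: SpatialParams          # spatial description kept separate from audio


def wrap_legacy_mono(samples: List[float], frame_len: int,
                     params: Optional[SpatialParams] = None) -> List[MonoSpatialFrame]:
    """Split mono samples into frames and attach the same metadata to each."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    if params is None:
        params = SpatialParams()
    return [
        MonoSpatialFrame(samples[start:start + frame_len], params)
        for start in range(0, len(samples), frame_len)
    ]


# Example: ten legacy mono samples, four per frame, placed slightly to the left.
frames = wrap_legacy_mono([0.1] * 10, frame_len=4,
                          params=SpatialParams(azimuth_deg=20.0))
```

The audio samples pass through unchanged. Only the attached parameters would need to be authored or estimated, which reflects the split between transport channel and metadata described in the list above.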
Defining Specifications
| Specification | Title |
|---|---|
| TS 26.253 | 3GPP TS 26.253 |