Description
OMAF, standardized as ISO/IEC 23090-2 and adopted by 3GPP in TS 26.118 (Virtual Reality profiles for streaming applications) and related specifications, is a comprehensive media format designed specifically for omnidirectional (360-degree) media. It builds upon existing media foundations such as the ISO Base Media File Format (ISOBMFF) and High Efficiency Video Coding (HEVC), adding the extensions needed for spherical video. The core of OMAF is the definition of a coordinate system and projection methods that map 360-degree spherical video onto a 2D rectangular video frame for efficient encoding and storage. Common projections include Equirectangular Projection (ERP) and Cubemap Projection (CMP).
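To make the projection idea concrete, the snippet below sketches a plain equirectangular mapping from a viewing direction to a pixel position. The axis conventions, function name, and resolutions are illustrative assumptions; the normative sample-location equations are defined in ISO/IEC 23090-2 itself.

```python
import math

def erp_project(yaw_deg: float, pitch_deg: float, width: int, height: int) -> tuple[float, float]:
    """Map a viewing direction on the sphere (yaw/azimuth and pitch/elevation,
    in degrees) to (u, v) pixel coordinates in an equirectangular (ERP) frame.

    Illustrative convention only: yaw = 0, pitch = 0 lands at the frame centre
    and yaw increases to the left; this is a sketch, not the OMAF equations.
    """
    yaw = math.radians(yaw_deg)      # azimuth in [-pi, pi]
    pitch = math.radians(pitch_deg)  # elevation in [-pi/2, pi/2]
    u = (0.5 - yaw / (2 * math.pi)) * width   # the full 360° yaw range spans the width
    v = (0.5 - pitch / math.pi) * height      # the full 180° pitch range spans the height
    return u, v

# Looking straight ahead lands mid-frame; other directions map proportionally.
print(erp_project(0.0, 0.0, 3840, 1920))    # -> (1920.0, 960.0)
print(erp_project(90.0, 45.0, 3840, 1920))  # -> (960.0, 480.0)
```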
Architecturally, OMAF defines a media processing pipeline. On the creation side, the spherical video is projected, encoded using HEVC (or optionally AVC), and packaged into ISOBMFF segments carrying OMAF-specific metadata. This metadata includes essential information such as the projection format, region-wise packing (which describes how regions of the projected picture are resized and rearranged within the coded frame), and the initial viewing orientation. For delivery, OMAF supports Dynamic Adaptive Streaming over HTTP (DASH), where the media is encoded at multiple quality levels (bitrates) and divided into segments. A key feature is viewport-dependent streaming, in which the client requests higher-quality segments only for the portion of the sphere currently in the user's field of view (the viewport), saving bandwidth.
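As an illustration of what region-wise packing metadata expresses, here is a minimal sketch (with simplified, non-normative field names) of mapping a decoded packed-picture pixel back to projected-picture coordinates, which a player needs before placing samples on the sphere. The per-region rotation/mirroring transform that OMAF can also signal is ignored here.

```python
from dataclasses import dataclass

@dataclass
class PackedRegion:
    # Simplified view of one region-wise packing entry: which rectangle of the
    # projected (e.g. ERP) picture ends up where, and at what size, in the
    # packed picture that is actually encoded. Field names are illustrative.
    proj_x: int; proj_y: int; proj_w: int; proj_h: int
    packed_x: int; packed_y: int; packed_w: int; packed_h: int

def packed_to_projected(x: float, y: float, regions: list[PackedRegion]) -> tuple[float, float]:
    """Map a packed-picture pixel back to projected-picture coordinates."""
    for r in regions:
        if (r.packed_x <= x < r.packed_x + r.packed_w
                and r.packed_y <= y < r.packed_y + r.packed_h):
            sx = r.proj_w / r.packed_w   # horizontal scale of this region
            sy = r.proj_h / r.packed_h   # vertical scale of this region
            return (r.proj_x + (x - r.packed_x) * sx,
                    r.proj_y + (y - r.packed_y) * sy)
    raise ValueError("pixel not covered by any packed region")

# Toy packing of a 3840x1920 ERP picture: left half kept at full resolution,
# right half downscaled by 2x in the packed picture.
regions = [
    PackedRegion(0,    0, 1920, 1920, 0,    0, 1920, 1920),
    PackedRegion(1920, 0, 1920, 1920, 1920, 0, 960,  960),
]
print(packed_to_projected(2400.0, 500.0, regions))  # -> (2880.0, 1000.0)
```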
On the client/player side, the OMAF player demultiplexes the stream, reads the OMAF metadata, decodes the video, and performs the inverse projection to render the spherical video for display, typically on a head-mounted display (HMD) or a smartphone used with a VR viewer. It tracks the viewport and adapts its streaming requests accordingly. OMAF also specifies audio formats, including channel-based, scene-based (Ambisonics), and object-based audio, to accompany the 360-degree video for a fully immersive experience. Within the 5G system, OMAF serves as a key application-layer format that leverages 5G's high bandwidth and low latency to deliver immersive media services.
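The sketch below shows, under simplifying assumptions (an ERP tile grid and a viewport modeled as a yaw/pitch rectangle rather than a true view frustum), how viewport tracking can drive which tiles a player requests at high quality. The tile grid size, field-of-view values, and function name are illustrative, not taken from the specification.

```python
def tiles_in_viewport(yaw_deg, pitch_deg, hfov_deg=90.0, vfov_deg=90.0,
                      cols=8, rows=4):
    """Return (col, row) indices of ERP tiles a player might fetch at high
    quality for the current head pose; everything else can come from a
    low-bitrate fallback representation."""
    high = set()
    for col in range(cols):
        # Tile centre in yaw, with column 0 starting at -180 degrees.
        tile_yaw = -180.0 + (col + 0.5) * 360.0 / cols
        # Shortest angular distance in yaw (handles the +/-180 wrap-around).
        dyaw = (tile_yaw - yaw_deg + 180.0) % 360.0 - 180.0
        for row in range(rows):
            tile_pitch = 90.0 - (row + 0.5) * 180.0 / rows
            dpitch = tile_pitch - pitch_deg
            # Keep the tile if its rectangle overlaps the viewport rectangle.
            if (abs(dyaw) <= hfov_deg / 2 + 180.0 / cols
                    and abs(dpitch) <= vfov_deg / 2 + 90.0 / rows):
                high.add((col, row))
    return high

# Looking straight ahead: only the tiles around (yaw=0, pitch=0) are requested
# at high quality on the next segment boundary.
print(sorted(tiles_in_viewport(0.0, 0.0)))
```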
Purpose & Motivation
OMAF was created to address the lack of standardization in the rapidly emerging field of 360-degree and virtual reality (VR) video. Before OMAF, content creators, service providers, and device manufacturers used proprietary or incompatible formats for capturing, encoding, and streaming spherical video. This fragmentation threatened to stifle the growth of immersive media by creating walled gardens where content from one provider might not play on another's device, similar to the early days of mobile video.
The primary problem OMAF solves is ensuring interoperability. It provides a single, agreed-upon format that guarantees a piece of OMAF-compliant 360-degree content will play correctly on any OMAF-compliant player, regardless of the vendor. This reduces complexity and cost for the ecosystem. Furthermore, OMAF addresses the significant technical challenge of bandwidth. A full high-resolution 360-degree video requires enormous data rates if streamed in full quality at all times. OMAF's viewport-dependent streaming mechanism is a key innovation that solves this by dynamically streaming high quality only where the user is looking, making immersive video services feasible over mobile networks like 4G and 5G.
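A rough illustration of why this matters, with assumed rather than normative numbers: a roughly 90°×90° viewport covers only about one eighth of an equirectangular frame, so delivering full quality only there (plus a low-quality background) shrinks the high-quality pixel budget accordingly.

```python
# Back-of-the-envelope illustration (assumed numbers, not from the spec):
# an 8K equirectangular frame vs. the part a ~90°x90° viewport actually shows.
erp_w, erp_h = 7680, 3840            # full 360° x 180° picture
view_w = erp_w * 90 / 360            # 90° of the 360° yaw range
view_h = erp_h * 90 / 180            # 90° of the 180° pitch range
print(view_w * view_h / (erp_w * erp_h))  # 0.125 -> ~1/8 of the pixels are in view
```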
Its creation was motivated by the industry's move towards immersive experiences as part of 5G use cases such as enhanced Mobile Broadband (eMBB). Standards bodies such as MPEG and 3GPP collaborated to ensure the format was optimized for both storage/broadcast and adaptive streaming over IP networks. By being part of 3GPP specifications, OMAF is directly integrated into the 5G media delivery architecture, enabling operators to offer standardized, high-quality VR/360 video services.
Key Features
- Standardized spherical video projection formats (ERP, CMP) and metadata
- Viewport-dependent adaptive streaming using DASH for bandwidth efficiency
- Integration with HEVC/H.265 and AVC/H.264 video codecs
- ISOBMFF-based file format and segmentation for streaming (see the box-walking sketch after this list)
- Support for immersive audio formats (Ambisonics, object audio)
- Defines initial viewport, region-wise packing, and coordinate systems
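Because these features are signaled as ISOBMFF boxes, one quick way to see whether a file carries OMAF metadata is to walk its box tree. The sketch below does so under one reading of the OMAF box layout ('povd', 'prfr', 'rwpk', 'rotn', 'covi' beneath the restricted-scheme sample entry); the container list, payload-skip offsets, and input file name are assumptions, and this is not a validated parser.

```python
import struct

# Payload bytes to skip before child boxes start, for boxes treated as
# containers here: 'stsd' is a FullBox plus an entry count (8 bytes); visual
# sample entries ('resv', 'hvc1', ...) carry 78 bytes of fixed fields first.
CONTAINERS = {
    b"moov": 0, b"trak": 0, b"mdia": 0, b"minf": 0, b"stbl": 0,
    b"stsd": 8, b"resv": 78, b"hvc1": 78, b"hev1": 78, b"avc1": 78,
    b"rinf": 0, b"schi": 0, b"povd": 0,
}
# fourCCs this sketch assumes carry OMAF metadata (projected omnidirectional
# video, projection format, region-wise packing, rotation, coverage).
OMAF_BOXES = {b"povd", b"prfr", b"rwpk", b"rotn", b"covi"}

def walk(data: bytes, start: int, end: int, path: str = "") -> None:
    """Recursively print the ISOBMFF box tree between byte offsets start..end,
    marking boxes that look like OMAF metadata."""
    pos = start
    while pos + 8 <= end:
        size, boxtype = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:   # 64-bit 'largesize' follows the type field
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:  # box extends to the end of the file (top level only)
            size = end - pos
        if size < header or pos + size > end:
            break        # malformed size; stop scanning this container
        name = path + "/" + boxtype.decode("latin-1")
        tag = "  <-- OMAF metadata" if boxtype in OMAF_BOXES else ""
        print(f"{name}{tag}")
        if boxtype in CONTAINERS:
            walk(data, pos + header + CONTAINERS[boxtype], pos + size, name)
        pos += size

if __name__ == "__main__":
    with open("example_omaf.mp4", "rb") as f:  # hypothetical input file
        buf = f.read()
    walk(buf, 0, len(buf))
```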
Evolution Across Releases
Initial standardization of OMAF in 3GPP as part of 5G Phase 1. It defined the core architecture for 360-degree video, including the media format based on HEVC, Equirectangular and Cubemap projections, basic viewport-independent streaming, and integration with 3GPP's Packet Switched Streaming (PSS) and Multimedia Broadcast/Multicast Service (MBMS) frameworks. This established the foundation for interoperable immersive media services.
Enhanced OMAF with support for viewport-dependent streaming using DASH, significantly improving bandwidth efficiency. Introduced more sophisticated projection formats, improved region-wise packing schemes, and added support for overlay graphics (e.g., subtitles, menus) on the spherical video. Audio capabilities were expanded for more immersive experiences.
Further refinements for volumetric and six Degrees of Freedom (6DoF) media, extending beyond basic 360-degree video. Introduced support for multi-view and layered coding to enable more advanced immersive experiences. Enhanced the metadata and signaling for complex media scenes.
Defining Specifications
| Specification | Title |
|---|---|
| TS 26.114 | IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction |
| TS 26.118 | Virtual Reality (VR) profiles for streaming applications |
| TS 26.511 | 5G Media Streaming (5GMS); Profiles, codecs and formats |
| TR 26.818 | 3GPP TR 26.818 |
| TR 26.862 | 3GPP TR 26.862 |
| TR 26.891 | 3GPP TR 26.891 |
| TR 26.918 | Virtual Reality (VR) media services over 3GPP |
| TR 26.919 | 3GPP TR 26.919 |
| TR 26.926 | Traffic models and quality evaluation methods for media and XR services in 5G systems |
| TR 26.962 | 3GPP TR 26.962 |
| TR 26.998 | Support of 5G glass-type Augmented Reality/Mixed Reality (AR/MR) devices |
| TR 26.999 | 3GPP TR 26.999 |