3DOF

Three Degrees of Freedom

Services
Introduced in Rel-15
3DOF refers to a basic level of spatial audio and visual immersion in media services, typically supporting head rotation (yaw, pitch, roll) without positional tracking. It enables initial immersive experiences like 360-degree video and basic VR applications within 3GPP media delivery frameworks, providing a foundation for more advanced XR services.

Description

Within the 3GPP ecosystem, 3DOF (Three Degrees of Freedom) defines a media service capability for delivering immersive audio-visual content where the consumer's viewpoint can rotate in three angular dimensions but cannot translationally move within the virtual environment. The three rotational axes are yaw (left/right rotation around a vertical axis), pitch (up/down rotation around a lateral axis), and roll (tilting rotation around a forward/backward axis). This is fundamentally implemented through the delivery and rendering of omnidirectional (360-degree) media, where the full spherical field of view is captured or generated, and the client application or device renders only a viewport based on the user's current head orientation.
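As a concrete illustration, the three rotation angles can be composed into a single rotation matrix that a renderer applies to the viewing direction. The axis convention and function names below are illustrative assumptions for this sketch, not taken from any 3GPP specification:

```python
import math

def rot_matrix(yaw, pitch, roll):
    """Compose a rotation matrix from yaw (vertical axis), pitch
    (lateral axis), and roll (forward axis), all in radians.
    Assumed convention: x forward, y left, z up; R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    Rz = [[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]]    # yaw about vertical axis
    Ry = [[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]]    # pitch about lateral axis
    Rx = [[1, 0, 0], [0, cr, -sr], [0, sr, cr]]    # roll about forward axis
    def mul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(3))
                 for j in range(3)] for i in range(3)]
    return mul(Rz, mul(Ry, Rx))

def rotate(m, v):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]
```

For example, a 90-degree yaw applied to the forward vector turns the gaze purely in the horizontal plane, which is exactly the "no translation" property that distinguishes 3DOF from 6DOF.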

Technically, 3GPP specifications such as TS 26.114 (IMS Multimedia Telephony; media handling and interaction) and TS 26.118 (Virtual Reality (VR) profiles for streaming applications) define the protocols and codecs for transporting 3DOF media. The content is typically encoded using projection formats such as Equirectangular Projection (ERP) or Cubemap Projection (CMP), which map the spherical video onto a 2D plane for compression with standard video codecs such as HEVC or VVC. Accompanying spatial audio, studied in TR 26.918 (Virtual Reality (VR) media services over 3GPP), matches the visual viewpoint, often using Scene-based Audio (e.g., Higher Order Ambisonics, HOA) or Object-based Audio with associated metadata. The Media Presentation Description (MPD) in DASH (Dynamic Adaptive Streaming over HTTP) is extended to signal 3DOF characteristics, enabling adaptive streaming of different quality segments for different spatial regions (viewport-dependent streaming) to optimize bandwidth.
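The ERP mapping is simple enough to sketch directly: longitude and latitude on the sphere map linearly to pixel coordinates on the 2D plane. The function names and frame dimensions below are illustrative, and the sign conventions (top of frame = north pole) follow common equirectangular practice rather than any single specification:

```python
import math

def erp_to_pixel(lon, lat, width, height):
    """Map a sphere direction (longitude lon in [-pi, pi], latitude lat
    in [-pi/2, pi/2], radians) to ERP pixel coordinates: longitude maps
    linearly to the horizontal axis, latitude to the vertical axis."""
    u = (lon / (2 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v

def pixel_to_erp(u, v, width, height):
    """Inverse mapping: ERP pixel coordinates back to a sphere direction."""
    lon = (u / width - 0.5) * 2 * math.pi
    lat = (0.5 - v / height) * math.pi
    return lon, lat
```

The straight-ahead direction (longitude 0, latitude 0) lands at the centre of the ERP frame, which is why viewport-dependent schemes can describe regions of interest simply as rectangles in the projected picture.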

The architecture involves a 3DOF media server, which stores and streams the projected video and spatial audio assets, and a 3DOF client, which could be a smartphone in a head-mounted viewer or a standalone VR device with orientation sensors. The client decodes the video, extracts the appropriate viewport based on real-time sensor input for head orientation, and renders it on the display. The audio renderer uses head-related transfer functions (HRTFs) to binaurally render the spatial audio scene corresponding to the current head orientation. Key network components include the 5G Media Streaming (5GMS) framework, which leverages 5G network capabilities like high throughput and low latency to ensure a smooth, high-quality 3DOF streaming experience without motion sickness-inducing lag.
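On the client side, viewport-dependent operation means deciding which spatial region the current head orientation falls into. A minimal sketch, assuming an ERP frame split into a uniform tile grid; the grid layout and helper name are hypothetical, since tiling schemes are a deployment choice rather than something fixed by the specifications:

```python
import math

def gaze_to_tile(yaw, pitch, cols, rows):
    """Map head orientation (yaw in [-pi, pi], pitch in [-pi/2, pi/2],
    radians) to the index of the ERP tile containing the viewport
    centre, for a cols x rows tile grid (row-major indexing).
    Hypothetical helper for illustration only."""
    col = min(int((yaw / (2 * math.pi) + 0.5) * cols), cols - 1)
    row = min(int((0.5 - pitch / math.pi) * rows), rows - 1)
    return row * cols + col
```

A real client would track the whole set of tiles intersecting the field of view (plus a margin to hide fetch latency), but the orientation-to-tile lookup above is the core of the sensor-driven adaptation loop.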

3DOF's role is as an entry-level immersive media service within the broader Extended Reality (XR) continuum defined by 3GPP. It provides a foundational user experience and technical framework upon which more complex services like 6DOF (Six Degrees of Freedom) are built. It is crucial for applications like virtual tourism, basic training simulations, and 360-degree live events, where full positional freedom is not required but a sense of presence is desired.

Purpose & Motivation

3DOF was introduced to standardize the delivery of basic immersive media over mobile networks, addressing the market emergence of affordable VR viewers and 360-degree cameras. Prior to 3GPP standardization, proprietary solutions for streaming 360-degree video existed, leading to fragmentation, interoperability issues, and inefficient use of network resources. The lack of a unified approach hindered widespread service deployment by operators and content providers. 3GPP's work aimed to create a scalable, adaptive, and quality-managed ecosystem for immersive media, leveraging existing multimedia frameworks like DASH and extending them for spatial characteristics.

The primary problem 3DOF solves is enabling a compelling, standardized immersive experience within the constraints of current consumer hardware (which often lacks positional tracking) and mobile network bandwidth. It allows operators to offer new media services that go beyond traditional flat video, creating new revenue streams in entertainment, education, and enterprise. By defining efficient projection and compression methods, along with viewport-adaptive streaming, it addresses the significant bandwidth challenge of 360-degree video (which requires 4-6 times the pixels of a standard HD viewport) without requiring consumers to download the entire ultra-high-resolution sphere at full quality.

Historically, 3DOF represents the first step in 3GPP's formalization of XR services, initiated in Release 15 alongside early 5G deployments. It was motivated by the need to demonstrate 5G's value beyond enhanced mobile broadband, showcasing its ability to support new media formats requiring high data rates and consistent quality. By establishing 3DOF, 3GPP provided a clear evolutionary path from traditional media to fully interactive 6DOF XR, allowing the industry to develop and deploy services incrementally while the underlying device capabilities and network optimizations (like edge computing for more complex rendering) continued to mature.

Key Features

  • Supports three rotational movements (yaw, pitch, roll) for viewpoint control
  • Utilizes spherical video projection formats (e.g., ERP, CMP) for efficient encoding and storage
  • Enables viewport-adaptive streaming to optimize bandwidth by delivering higher quality only to the user's current field of view
  • Integrates spatial audio formats synchronized with visual viewpoint orientation
  • Leverages standardized 5G Media Streaming (5GMS) framework for delivery and quality assurance
  • Provides foundational metadata and signaling in DASH MPD for client-driven adaptive playback of immersive content
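The viewport-adaptive feature in the list above reduces, at its simplest, to choosing one representation per tile based on where the user is looking. A toy sketch with illustrative names and bitrates (the two-tier high/low scheme is an assumption; real deployments may use more quality levels):

```python
def select_tile_qualities(viewport_tiles, all_tiles, hi_kbps, lo_kbps):
    """Sketch of viewport-dependent quality selection: tiles inside the
    current viewport get the high-bitrate representation, all others the
    low-bitrate fallback.  Names and rates are illustrative, not taken
    from the DASH or 5GMS specifications."""
    return {t: (hi_kbps if t in viewport_tiles else lo_kbps)
            for t in all_tiles}
```

Keeping a low-quality version of every tile in flight is what lets the client show something immediately when the head turns, while the high-quality tiles for the new viewport are still being fetched.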

Evolution Across Releases

Rel-15

Introduced initial support for 3DOF media services within the 5G Media Streaming framework. Specifications defined baseline requirements for 360-degree video streaming, including viewport-adaptive streaming using DASH, and began work on spatial audio formats. This release established the fundamental architecture for delivering omnidirectional media over 5G networks.

Rel-16

Enhanced 3DOF support with improved codec efficiency for immersive video, including updates for HEVC and initial work on VVC. Introduced more detailed metadata for viewport-dependent processing and refined quality-of-experience metrics for 360-degree video streaming.

Rel-17

Formalized the broader XR framework, positioning 3DOF as a specific service profile within it. Work such as TR 26.928 (Extended Reality (XR) in 5G) provided comprehensive technical detail for 3DOF media formats, streaming, and rendering. Enhanced support for low-latency streaming, crucial for interactive VR.

Rel-18

Further optimizations for energy-efficient 3DOF streaming and rendering on devices. Improved integration with network APIs for enhanced quality-of-service management. Explored advanced audio-visual codec profiles specifically tuned for immersive media delivery.

Rel-19

Continued evolution within the XR work item, focusing on interoperability and refined performance requirements for mass-market 3DOF services. Addressed scalability for multicast/broadcast delivery of live 3DOF events and further enhancements to the media adaptation logic.

Defining Specifications

Specification  Title
TS 26.114  IP Multimedia Subsystem (IMS); Multimedia Telephony; Media handling and interaction
TS 26.118  Virtual Reality (VR) profiles for streaming applications
TR 26.918  Virtual Reality (VR) media services over 3GPP
TR 26.929  QoE parameters and metrics relevant to the Virtual Reality (VR) user experience