ROI

Region of Interest

Services
Introduced in Rel-13
A concept in multimedia services where a specific spatial or temporal area within media content is identified for enhanced processing or delivery. It enables bandwidth-efficient streaming by prioritizing critical parts of a video, such as a speaker's face, for higher quality while reducing detail in less important background areas.

Description

Region of Interest (ROI) is a feature within 3GPP multimedia standards, particularly relevant for video coding and streaming services like Multimedia Broadcast/Multicast Service (MBMS) and evolved Multimedia Broadcast Multicast Service (eMBMS). It operates by allowing the content provider or encoder to define specific spatial regions within a video frame, or temporal segments within a media timeline, that are deemed more important to the user experience. These regions are then tagged with metadata that instructs the network and receiving devices to apply differentiated treatment.
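As a rough illustration of what "tagging a spatial region with metadata" can look like, the sketch below models one rectangular ROI in normalized frame coordinates and converts it to pixels for a given resolution. The class name `SpatialRoi`, its fields, and the normalized-coordinate convention are assumptions made for this example; they are not taken from any 3GPP specification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SpatialRoi:
    """Hypothetical ROI record: a rectangle in normalized [0, 1] frame coordinates."""

    x: float         # left edge, as a fraction of frame width
    y: float         # top edge, as a fraction of frame height
    width: float     # rectangle width, as a fraction of frame width
    height: float    # rectangle height, as a fraction of frame height
    priority: int = 1  # illustrative importance level (assumption)

    def to_pixels(self, frame_width: int, frame_height: int) -> tuple[int, int, int, int]:
        """Return (left, top, right, bottom) in pixels, clamped to the frame."""
        left = max(0, round(self.x * frame_width))
        top = max(0, round(self.y * frame_height))
        right = min(frame_width, round((self.x + self.width) * frame_width))
        bottom = min(frame_height, round((self.y + self.height) * frame_height))
        return left, top, right, bottom


# Example: a centered region covering half the width and half the height of a 720p frame.
roi = SpatialRoi(x=0.25, y=0.125, width=0.5, height=0.5)
pixel_box = roi.to_pixels(1280, 720)
print(pixel_box)  # (320, 90, 960, 450)
```

Normalized coordinates are used here so that the same record stays valid when the stream is delivered at several resolutions, as in adaptive streaming.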

Architecturally, ROI support is integrated into the media processing and delivery chain. At the encoding stage, codecs compliant with standards like HEVC (H.265) or AVC (H.264) can generate bitstreams where certain slices or tiles correspond to the ROI, often encoded with higher fidelity (e.g., a lower quantization parameter, which means finer quantization). The service layer defines the signaling and carriage of ROI metadata; the specifications listed for this feature include TS 26.114 (IMS multimedia telephony media handling and interaction) and TS 23.333 (procedures for the Mp interface between MRFC and MRFP). This metadata can be transported via mechanisms like the Media Presentation Description (MPD) in Dynamic Adaptive Streaming over HTTP (DASH).
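Region-based quality differentiation at the encoder can be sketched as a per-block quantization parameter (QP) map, where blocks overlapping the ROI receive a lower QP (finer quantization, higher fidelity) than the background. The function `build_qp_map`, the 16-pixel block size, and the default QP values are illustrative assumptions, not the interface of any particular encoder; the 0 to 51 clamp reflects the QP range of 8-bit AVC and HEVC.

```python
def build_qp_map(frame_w, frame_h, roi_box, base_qp=32, roi_qp_delta=-6, block=16):
    """Return a rows x cols grid of QP values (hypothetical helper).

    roi_box is (left, top, right, bottom) in pixels. Any block that overlaps
    the ROI gets base_qp + roi_qp_delta; all other blocks keep base_qp.
    """
    cols = (frame_w + block - 1) // block  # ceiling division
    rows = (frame_h + block - 1) // block
    left, top, right, bottom = roi_box

    qp_map = []
    for r in range(rows):
        row = []
        for c in range(cols):
            bx0, by0 = c * block, r * block
            bx1, by1 = bx0 + block, by0 + block
            # Standard rectangle-overlap test between the block and the ROI.
            overlaps = bx0 < right and bx1 > left and by0 < bottom and by1 > top
            qp = base_qp + roi_qp_delta if overlaps else base_qp
            row.append(max(0, min(51, qp)))  # clamp to the valid QP range
        qp_map.append(row)
    return qp_map


# A tiny 64x48 frame (4x3 blocks) with the ROI covering the two center blocks.
qp_map = build_qp_map(64, 48, roi_box=(16, 16, 48, 32))
for row in qp_map:
    print(row)
# [32, 32, 32, 32]
# [32, 26, 26, 32]
# [32, 32, 32, 32]
```

Real encoders expose this kind of control in different ways (for example through per-block QP offsets or ROI rectangles), so the map above should be read as a conceptual model only.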

In the delivery phase, particularly for broadcast/multicast services, the network can utilize this information for resource optimization. The stage 3 specifications listed for ROI, TS 29.333 (Mp interface between MRFC and MRFP) and TS 29.334 (Iq interface between IMS-ALG and IMS-AGW), concern media gateway control rather than MBMS. While the core network and RAN typically deliver the entire media stream, the ROI metadata enables client-side applications or specialized middleware to prioritize decoding and rendering resources for the highlighted region. In advanced implementations, it could theoretically inform radio resource allocation for layered coding schemes, though the primary application is end-to-end between encoder and player.

Its role is to enhance perceived quality and efficiency in the bandwidth-constrained scenarios common in mobile networks. By focusing bits on semantically important content, ROI improves the quality of experience (QoE) for applications like sports broadcasting (focus on a player), video conferencing (focus on the speaker), or augmented reality, without proportionally increasing the overall bitrate.
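The idea of focusing bits on the ROI at a fixed overall bitrate can be made concrete with a small arithmetic sketch. It assumes a deliberately simple model in which each unit of ROI area receives `roi_weight` times as many bits as a unit of background area; the function name, the weighting model, and the numbers are illustrative assumptions, not a rate-control algorithm from any standard.

```python
def split_bitrate(total_kbps, roi_area_fraction, roi_weight):
    """Split a fixed bit budget between ROI and background (toy model).

    roi_area_fraction: share of the frame covered by the ROI, in (0, 1).
    roi_weight: how many times more bits per unit area the ROI receives.
    Returns (roi_kbps, background_kbps); the two always sum to total_kbps.
    """
    roi_share = roi_area_fraction * roi_weight
    background_share = 1.0 - roi_area_fraction
    roi_fraction = roi_share / (roi_share + background_share)
    roi_kbps = total_kbps * roi_fraction
    return roi_kbps, total_kbps - roi_kbps


# ROI covers 25% of the frame and gets 3x the bit density of the background.
roi_kbps, background_kbps = split_bitrate(1000, roi_area_fraction=0.25, roi_weight=3)
print(roi_kbps, background_kbps)  # 500.0 500.0

# Bit density per unit of frame area: the ROI ends up at 3x the background.
roi_density = roi_kbps / 0.25
background_density = background_kbps / 0.75
```

Under this model the total stays at 1000 kbps, yet the ROI is coded at three times the bit density of the background, which is the sense in which quality improves without a proportional bitrate increase.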

Purpose & Motivation

ROI technology was introduced to address the fundamental challenge of delivering high-quality video over mobile networks with limited and variable bandwidth. Prior approaches applied uniform encoding quality across entire video frames, which is inefficient because human perception prioritizes certain areas (like faces or moving objects). This led to either excessive bandwidth consumption for high-quality full-frame video or uniformly poor quality when bandwidth was restricted.

The creation of ROI, formalized in 3GPP Release 13 alongside enhancements for eMBMS and mission-critical video, was motivated by the growth of rich media services and the need for smarter content-aware delivery. It solves the problem of optimizing the user's quality of experience within given network constraints. By enabling content-adaptive encoding, it allows service providers to offer subjectively better video quality at the same bitrate or maintain acceptable quality at lower bitrates, improving network capacity utilization.

Historically, its development aligns with advancements in video codec features and the industry's push for more efficient broadcast services for public safety, live events, and automotive scenarios. It addresses limitations of previous 'one-size-fits-all' streaming by introducing a level of semantic awareness into the media delivery chain, making adaptive bitrate streaming more intelligent and perceptually optimized.

Key Features

  • Enables definition of spatial regions (e.g., rectangular areas) or temporal segments within media for prioritized encoding
  • Supports signaling via metadata in streaming formats like DASH (MPD) and broadcast service announcements
  • Facilitates bandwidth efficiency by allowing higher quality encoding for ROI versus background
  • Integrates with advanced video codecs (HEVC, AVC) that support region-based quality differentiation
  • Enhances user quality of experience (QoE) for applications like surveillance, sports, and video conferencing
  • Can be used in conjunction with scalable or layered coding for graceful degradation

Evolution Across Releases

Rel-13 Initial

Introduced ROI as a formal concept within 3GPP, primarily in the context of MBMS and Proximity Services (ProSe). Specified initial support for signaling ROI information in media descriptions and service metadata to enable content-aware broadcast and group communication services.

Defining Specifications

Specification    Title
TS 23.333        3GPP TS 23.333
TS 23.334        3GPP TS 23.334
TS 26.114        3GPP TS 26.114
TS 29.162        3GPP TS 29.162
TS 29.238        3GPP TS 29.238
TS 29.333        3GPP TS 29.333
TS 29.334        3GPP TS 29.334