Description
Supplemental Enhancement Information (SEI) is a standardized mechanism for embedding auxiliary metadata within a video bitstream, as defined by video coding standards adopted by 3GPP for multimedia services. It is part of the syntax of codecs like H.264/AVC (Advanced Video Coding) and H.265/HEVC (High Efficiency Video Coding). SEI messages are contained in the Network Abstraction Layer (NAL) units of the bitstream. They carry information that is not required for the normative decoding process to produce correct sample values but is beneficial for improving the user experience, guiding display processes, enabling advanced features, or providing debugging information.
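As a rough illustration of where SEI sits in the bitstream, the sketch below splits an Annex B byte stream into NAL units and flags the ones that carry SEI. It assumes the stream uses start-code framing (not the length-prefixed framing common in MP4 files); the function names `split_annexb` and `is_sei` are hypothetical helpers, not part of any 3GPP or codec API. The NAL unit type values used (6 for H.264, 39 and 40 for prefix and suffix SEI in H.265) come from the codec specifications.

```python
H264_SEI = 6                               # H.264 nal_unit_type for SEI
HEVC_PREFIX_SEI, HEVC_SUFFIX_SEI = 39, 40  # H.265 prefix / suffix SEI


def split_annexb(stream: bytes):
    """Yield NAL units (start codes removed) from an Annex B byte stream."""
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        start = i + 3
        nxt = stream.find(b"\x00\x00\x01", start)
        end = len(stream) if nxt == -1 else nxt
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next (4-byte) start code or to stream padding.
        nal = stream[start:end].rstrip(b"\x00")
        if nal:
            yield nal
        i = nxt


def is_sei(nal: bytes, codec: str = "h264") -> bool:
    """Check the NAL header to see whether this unit carries SEI messages."""
    if codec == "h264":
        return (nal[0] & 0x1F) == H264_SEI          # low 5 bits of byte 0
    if codec == "hevc":
        return ((nal[0] >> 1) & 0x3F) in (HEVC_PREFIX_SEI, HEVC_SUFFIX_SEI)
    raise ValueError(f"unsupported codec: {codec}")


# Usage: an SPS, an SEI, and an IDR slice (payload bytes are placeholders).
stream = (b"\x00\x00\x00\x01\x67\x42"
          b"\x00\x00\x01\x06\x05\x01\xaa\x80"
          b"\x00\x00\x00\x01\x65\x88\x80")
sei_units = [n for n in split_annexb(stream) if is_sei(n)]
```

A decoder that does not care about SEI can simply drop the units this check identifies, which is what makes the metadata optional for decoding.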
SEI messages are inserted by the encoder and transported alongside the coded video slices. A decoder may parse them and act on their contents, or ignore them without affecting the decoded sample values. The video coding standards define the syntax and semantics of each message: every SEI message carries a payload type, a payload size, and a payload whose structure is determined by the type. Common examples include "pic_timing", which carries buffer removal and output timing (and optionally clock timestamps) used to synchronize decoding and display; "user_data_registered_itu_t_t35", which carries vendor-specific or application-specific data; and "recovery_point", which marks a position from which a decoder can begin decoding, after a seek or after data loss, and reach correct or approximately correct output. The messages are unidirectional (from encoder to decoder) and do not require acknowledgment.
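The payload type and payload size mentioned above are coded as a run of 0xFF bytes followed by one final byte, and the values are summed. A minimal sketch of walking the messages in one SEI NAL unit follows, assuming the one-byte H.264 NAL header has already been stripped. The helper names and the small `SEI_NAMES` table are illustrative only, and truncated input is not handled.

```python
SEI_NAMES = {  # a few payload types shared by H.264 and H.265
    0: "buffering_period",
    1: "pic_timing",
    4: "user_data_registered_itu_t_t35",
    5: "user_data_unregistered",
    6: "recovery_point",
    45: "frame_packing_arrangement",
}


def unescape_rbsp(ebsp: bytes) -> bytes:
    """Drop emulation prevention bytes (the 0x03 in each 00 00 03)."""
    out = bytearray()
    zeros = 0
    for b in ebsp:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def parse_sei_messages(rbsp: bytes):
    """Return [(payload_type, payload_bytes), ...] from an SEI RBSP."""
    msgs = []
    pos = 0
    # A lone 0x80 at the end is rbsp_trailing_bits, not another message.
    while pos < len(rbsp) and rbsp[pos:] != b"\x80":
        ptype = 0
        while rbsp[pos] == 0xFF:      # each 0xFF adds 255
            ptype += 255
            pos += 1
        ptype += rbsp[pos]
        pos += 1
        psize = 0
        while rbsp[pos] == 0xFF:
            psize += 255
            pos += 1
        psize += rbsp[pos]
        pos += 1
        msgs.append((ptype, rbsp[pos:pos + psize]))
        pos += psize
    return msgs


# Usage: a recovery_point message followed by a 2-byte placeholder message.
rbsp = unescape_rbsp(bytes([0x06, 0x01, 0xC4, 0x05, 0x02, 0xAA, 0xBB, 0x80]))
for ptype, payload in parse_sei_messages(rbsp):
    print(SEI_NAMES.get(ptype, f"type {ptype}"), payload.hex())
```

Because every message states its own size, a parser can skip payload types it does not recognise and continue with the next message.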
From a 3GPP network perspective, SEI messages are treated as part of the video user plane. During multimedia session establishment via protocols like SIP and SDP, the support for certain SEI messages can be negotiated. The core network and radio access network transparently transport the bitstream containing SEI. However, media-aware network elements, like a Media Resource Function (MRF) or application servers, might parse and even generate or modify SEI messages to adapt content for different devices or network conditions. For example, an MRF might insert frame packing arrangement SEI messages for stereoscopic 3D video services.
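To give a sense of what generating an SEI message involves for a media-aware element, here is a minimal sketch that builds an H.264 SEI NAL unit. It uses the simpler "user_data_unregistered" message (payload type 5: a 16-byte UUID followed by free-form bytes) rather than frame packing arrangement, whose fields need bit-level Exp-Golomb coding. The UUID and data are made-up values, the function names are hypothetical, and nothing here reflects how any particular MRF is implemented.

```python
import uuid


def escape_rbsp(rbsp: bytes) -> bytes:
    """Insert emulation prevention bytes so no start code appears in the NAL."""
    out = bytearray()
    zeros = 0
    for b in rbsp:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)          # breaks up 00 00 0x sequences
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def ff_coded(value: int) -> bytes:
    """Code a payload type or size as 0xFF bytes plus one final byte."""
    return b"\xff" * (value // 255) + bytes([value % 255])


def build_user_data_unregistered(app_uuid: uuid.UUID, data: bytes) -> bytes:
    """Return an Annex B framed H.264 SEI NAL unit with one message."""
    payload = app_uuid.bytes + data
    rbsp = ff_coded(5) + ff_coded(len(payload)) + payload + b"\x80"
    nal_header = bytes([0x06])        # nal_ref_idc = 0, nal_unit_type = 6
    return b"\x00\x00\x00\x01" + nal_header + escape_rbsp(rbsp)


# Usage with a made-up application UUID.
demo_uuid = uuid.UUID("01234567-89ab-cdef-0123-456789abcdef")
sei_nal = build_user_data_unregistered(demo_uuid, b"hi")
```

In a real bitstream the SEI NAL unit would be placed in the access unit ahead of the coded slices it applies to; where exactly it goes is governed by the codec's NAL unit ordering rules.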
SEI plays a central role in enabling enhanced video services over mobile networks. It supports features such as dynamic adaptation, trick modes (fast forward, rewind), HDR signalling (for example, mastering display colour volume metadata), and content description. By standardizing how this metadata is carried, SEI supports interoperability between different vendors' encoders, decoders, and middleware, allowing for a rich ecosystem of video applications within the 3GPP multimedia framework.
Purpose & Motivation
SEI exists to solve the problem of conveying non-essential but highly valuable control and descriptive information alongside the compressed video essence, without altering the core decoding process. Without it, such auxiliary data would have to be sent out-of-band (e.g., in an RTP header extension or over a separate channel), which complicates synchronization and increases system complexity. SEI provides an in-band, standardized carriage mechanism.
The primary motivation was to enable advanced video features and improve robustness and usability. For instance, precise display timing information (pic_timing SEI) is critical for smooth playback and lip-sync in multimedia streaming. Pan-scan rectangles allow the viewing area to be adjusted for different display aspect ratios. SEI messages for buffering period and recovery points are valuable for adaptive streaming with DASH, which 3GPP uses in its streaming and MBMS services, and for comparable protocols such as HLS. They help decoders manage buffers and resume decoding efficiently after packet loss or a seek operation.
Historically, SEI was incorporated from the ITU-T and ISO/IEC MPEG video standards into the 3GPP multimedia specifications to ensure mobile devices could interoperate with broad video ecosystems. Its creation addresses the limitations of having a video bitstream that only contains pixel data, by adding a layer of 'intelligence' that guides how the video should be handled post-decoding. This allows service providers to offer a consistent, high-quality video experience across diverse networks and devices, which is a fundamental goal of 3GPP's Packet-Switched Streaming Service (PSS) and Multimedia Broadcast/Multicast Service (MBMS).
Key Features
- In-band carriage within video bitstream NAL units
- Defined payload types for timing, display control, and user data
- Not required for normative decoding but enhances usability
- Supports vendor-specific data through registered user data messages
- Enables features like HDR metadata, frame packing for 3D, and buffering control
- Used by adaptive streaming protocols for random access and trick modes
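As one concrete case of the random-access feature listed above, the sketch below decodes the fields of an H.264 "recovery_point" payload: recovery_frame_cnt (Exp-Golomb coded), exact_match_flag, broken_link_flag, and changing_slice_group_idc. The H.265 version of this message has a different layout, so this applies to H.264 only. `BitReader` and `parse_recovery_point_h264` are hypothetical helper names, and the input is assumed to be the message payload with emulation prevention bytes already removed.

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0                  # current bit position

    def u(self, n: int) -> int:
        """Read n bits as an unsigned integer."""
        val = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            val = (val << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return val

    def ue(self) -> int:
        """Read an unsigned Exp-Golomb code, ue(v)."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)


def parse_recovery_point_h264(payload: bytes) -> dict:
    """Decode the fields of an H.264 recovery_point SEI payload."""
    r = BitReader(payload)
    return {
        "recovery_frame_cnt": r.ue(),            # frames until output is correct
        "exact_match_flag": bool(r.u(1)),
        "broken_link_flag": bool(r.u(1)),
        "changing_slice_group_idc": r.u(2),
    }


# Usage: 0xC4 encodes recovery_frame_cnt = 0 with exact_match_flag set.
info = parse_recovery_point_h264(b"\xc4")
```

A player that seeks to this access unit could use recovery_frame_cnt to decide how many decoded frames to hold back before it starts displaying.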
Evolution Across Releases
Introduced with the adoption of H.264/AVC as a primary video codec for 3GPP multimedia services. SEI messages from the H.264 standard, such as pic_timing and user_data_unregistered, became part of the video bitstream syntax for services like PSS and MBMS, enabling basic enhancement information transport.
Defining Specifications
| Specification | Title |
|---|---|
| TS 26.114 | 3GPP TS 26.114 |
| TS 26.116 | 3GPP TS 26.116 |
| TS 26.118 | 3GPP TS 26.118 |
| TS 26.223 | 3GPP TS 26.223 |
| TS 26.346 | 3GPP TS 26.346 |
| TS 26.804 | 3GPP TS 26.804 |
| TS 26.855 | 3GPP TS 26.855 |
| TS 26.862 | 3GPP TS 26.862 |
| TS 26.906 | 3GPP TS 26.906 |
| TS 26.919 | 3GPP TS 26.919 |
| TS 26.946 | 3GPP TS 26.946 |
| TS 26.948 | 3GPP TS 26.948 |
| TS 26.949 | 3GPP TS 26.949 |
| TS 26.955 | 3GPP TS 26.955 |
| TS 26.962 | 3GPP TS 26.962 |