MP4

MPEG-4 Part 14 File Format

Services
Introduced in Rel-8
MP4 is a standardized multimedia container format, defined by ISO/IEC as MPEG-4 Part 14 and adopted in 3GPP specifications, for storing audio, video, and related data. It underpins mobile media services such as streaming and download, giving devices and networks a common, interoperable file format. Its box-based structure supports efficient storage, transmission, and playback of rich media content.

Description

The MP4 file format, formally MPEG-4 Part 14, is a versatile and structured container format for digital multimedia. Within the 3GPP ecosystem, it is specified so that multimedia content (video, audio, subtitles, and metadata) can be packaged, transmitted, and consumed consistently across diverse mobile devices and network infrastructures. The format is based on the ISO Base Media File Format (ISOBMFF), which uses a hierarchical structure of 'boxes' or 'atoms'. Each box is a discrete data object that begins with a header giving its length and four-character type, and it contains either media data, metadata describing that data, or further nested boxes. Key structural boxes include the 'ftyp' box (file type and compatibility), the 'moov' box (movie metadata, containing track and timing information), and the 'mdat' box (the actual media data samples). This box-based architecture allows for efficient parsing, random access for seeking, and support for advanced features like streaming, trick modes, and content protection.
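The box layout described above can be walked with very little code. The following is a minimal sketch, not a complete parser: it assumes the standard ISOBMFF header (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the container"). The helper names `iter_boxes` and `make_box` and the tiny in-memory sample are invented for illustration.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(data: bytes, start: int = 0,
               end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_offset, payload_size) for each box in data[start:end]."""
    if end is None:
        end = len(data)
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            # The box runs to the end of the enclosing container.
            size = end - pos
        if size < header or pos + size > end:
            break  # malformed or truncated box: stop rather than guess
        yield box_type.decode("latin-1"), pos + header, size - header
        pos += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size header (illustration only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A synthetic file: ftyp, then moov containing a dummy mvhd, then mdat.
sample = (
    make_box(b"ftyp", b"isom" + struct.pack(">I", 0) + b"isom" + b"3gp6")
    + make_box(b"moov", make_box(b"mvhd", b"\x00" * 4))
    + make_box(b"mdat", b"\x01\x02\x03")
)

top_level = [box_type for box_type, _, _ in iter_boxes(sample)]

# Container boxes such as 'moov' are parsed by recursing into their payload.
moov_children = []
for box_type, offset, size in iter_boxes(sample):
    if box_type == "moov":
        moov_children = list(iter_boxes(sample, offset, offset + size))
```

Because every box carries its own length, a reader can skip boxes it does not understand, which is what makes the format extensible.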

In operation, an MP4 file encapsulates one or more media tracks. Each track is independently coded (e.g., using H.264/AVC for video or AAC for audio) and contains a sequence of samples (frames). The 'moov' box holds the Sample Table, an index that maps each sample to its byte offset, size, and timing, whether the sample data sits in the 'mdat' box or in an external file. This separation of metadata and data is fundamental for progressive download and for streaming protocols like Dynamic Adaptive Streaming over HTTP (DASH): when 'moov' precedes 'mdat', a player can begin playback before the whole file has arrived, and in DASH the 'moov' box is fetched first to initialize playback while media data is downloaded in segments. The format also supports fragmentation, where a presentation is split into a series of movie fragments ('moof' boxes), each with its own metadata, enabling live streaming and efficient recording.
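How the Sample Table resolves a sample to a byte range can be sketched as follows. This is a simplified illustration that assumes the relevant tables have already been decoded into Python lists: chunk offsets ('stco'), sample sizes ('stsz'), sample-to-chunk runs ('stsc', here reduced to `(first_chunk, samples_per_chunk)` pairs), and time-to-sample runs ('stts'). The function name `sample_locations` and the numbers are made up, and the result gives decode times; presentation times would additionally need any composition offsets.

```python
from typing import List, Tuple


def sample_locations(
    chunk_offsets: List[int],
    sample_sizes: List[int],
    sample_to_chunk: List[Tuple[int, int]],
    time_to_sample: List[Tuple[int, int]],
) -> List[Tuple[int, int, int]]:
    """Return (byte_offset, size, decode_time) for every sample, in order."""
    # Expand the run-length encoded durations into one decode time per sample.
    decode_times: List[int] = []
    current_time = 0
    for count, delta in time_to_sample:
        for _ in range(count):
            decode_times.append(current_time)
            current_time += delta

    locations: List[Tuple[int, int, int]] = []
    sample_index = 0
    for chunk_number, chunk_offset in enumerate(chunk_offsets, start=1):
        # The applicable run is the last one whose first_chunk (1-based)
        # is not greater than this chunk number.
        samples_per_chunk = 0
        for first_chunk, per_chunk in sample_to_chunk:
            if first_chunk <= chunk_number:
                samples_per_chunk = per_chunk
            else:
                break
        # Samples are stored back to back inside a chunk.
        offset = chunk_offset
        for _ in range(samples_per_chunk):
            if sample_index >= len(sample_sizes):
                break
            size = sample_sizes[sample_index]
            locations.append((offset, size, decode_times[sample_index]))
            offset += size
            sample_index += 1
    return locations


# Two chunks: the first holds three samples, the second holds two.
locations = sample_locations(
    chunk_offsets=[100, 400],
    sample_sizes=[50, 60, 70, 80, 90],
    sample_to_chunk=[(1, 3), (2, 2)],
    time_to_sample=[(5, 1000)],
)
```

With this index in hand, seeking reduces to finding the sample nearest the target time and reading `size` bytes at `byte_offset`, without scanning the media data.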

The role of MP4 in 3GPP networks is central to multimedia services defined in the Packet-Switched Streaming Service (PSS) and Multimedia Broadcast/Multicast Service (MBMS). 3GPP specifications such as TS 26.244 define how the format is applied to these services, specifying constraints and extensions for mobile use. For instance, they mandate specific codec profiles and levels, define how to signal content using Session Description Protocol (SDP), and ensure compatibility with 3GPP-defined streaming and download procedures. The format's support for timed text (subtitles), chapter markers, and multiple audio tracks makes it suitable for rich media applications. Because the format is standardized, content encoded once can be played back on any compliant device, from smartphones to tablets, across 2G, 3G, 4G, and 5G networks.
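A file declares which specifications it conforms to in its 'ftyp' box, whose payload is a major brand, a minor version, and a list of compatible brands. The sketch below decodes that payload. The function name `parse_ftyp`, the sample bytes, and the test "any brand starting with `3gp`" are illustrative assumptions only; the authoritative list of 3GPP brands is in the defining specifications.

```python
import struct
from typing import List, Tuple


def parse_ftyp(payload: bytes) -> Tuple[str, int, List[str]]:
    """Decode an 'ftyp' payload into (major_brand, minor_version, compatible_brands)."""
    if len(payload) < 8:
        raise ValueError("ftyp payload must hold at least a brand and a version")
    major_brand = payload[0:4].decode("latin-1")
    (minor_version,) = struct.unpack_from(">I", payload, 4)
    # The remainder is a sequence of four-character brand codes.
    compatible_brands = [
        payload[i:i + 4].decode("latin-1")
        for i in range(8, len(payload) - 3, 4)
    ]
    return major_brand, minor_version, compatible_brands


# Synthetic payload for illustration; real files carry their own values.
payload = b"3gp6" + struct.pack(">I", 256) + b"3gp6" + b"isom"
major_brand, minor_version, compatible_brands = parse_ftyp(payload)

# Heuristic (an assumption, not a conformance check): does any brand look 3GPP-like?
claims_3gpp = any(
    brand.startswith("3gp") for brand in [major_brand] + compatible_brands
)
```

A player would normally consult the compatible brands before attempting playback, since they indicate which sets of constraints the content claims to satisfy.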

Purpose & Motivation

The MP4 format was adopted and specified by 3GPP to solve the problem of multimedia interoperability in mobile environments. Before a common container was agreed, many proprietary and mutually incompatible media formats were in use, which fragmented the market. This hindered the development of a large-scale mobile media ecosystem in which content providers could deliver services reliably to any subscriber's device. The MPEG-4 standard, and specifically the MP4 container, offered a robust, internationally recognized solution that could encapsulate state-of-the-art video and audio codecs like H.264 and AAC, which 3GPP was also adopting for their efficiency.

The primary motivation was to enable advanced mobile services such as video telephony, streaming, and content download, which were key value-added services for 3G (UMTS) networks and beyond. The format needed to be efficient for storage and transmission over bandwidth-constrained radio links, support features like fast start and seek for a good user experience, and be flexible enough to support future codecs and media types. By standardizing on MP4, 3GPP ensured that network elements (like streaming servers), handsets, and content preparation tools had a common, well-defined target, reducing complexity and cost for the entire industry.

Furthermore, the MP4 format's design aligned well with the evolution towards IP-based packet services. Its tracks can be packetized for streaming over RTP/UDP/IP, and the file itself can be delivered over HTTP/TCP/IP for progressive download. This made it a durable choice as 3GPP networks evolved from circuit-switched bearers to all-IP core networks. The format's extensibility through the box mechanism allowed 3GPP to define specific boxes for its own needs, such as carrying 3GPP metadata, so the format could evolve alongside the standards themselves.

Key Features

  • Structured container based on ISO Base Media File Format (ISOBMFF) using 'boxes' for data and metadata
  • Supports multiple media tracks (video, audio, text) with independent coding and timing
  • Enables efficient streaming and progressive download through separation of metadata ('moov') and media data ('mdat')
  • Facilitates random access and seeking via comprehensive sample tables and optional movie fragmentation
  • Extensible design allowing for inclusion of future codecs, DRM systems, and metadata types
  • Standardized by 3GPP for PSS and MBMS, ensuring interoperability across all compliant mobile devices
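Two of the features listed above, progressive download and fragmentation, can be recognized from the order of a file's top-level boxes alone. The following is a rough heuristic sketch rather than a conformance check: it assumes the caller has already collected the top-level box types in file order, and the function name `classify_layout` is invented for this example.

```python
from typing import Dict, Iterable


def classify_layout(top_level_types: Iterable[str]) -> Dict[str, bool]:
    """Classify a file from its ordered top-level box types."""
    types = list(top_level_types)
    has_moov = "moov" in types
    has_mdat = "mdat" in types
    # Movie fragments appear as top-level 'moof' boxes.
    fragmented = "moof" in types
    # Playback can start early only if the index arrives before the media data.
    fast_start = has_moov and (
        not has_mdat or types.index("moov") < types.index("mdat")
    )
    return {"fragmented": fragmented, "fast_start": fast_start}


progressive = classify_layout(["ftyp", "moov", "mdat"])
index_at_end = classify_layout(["ftyp", "mdat", "moov"])
live_style = classify_layout(["ftyp", "moov", "moof", "mdat", "moof", "mdat"])
```

A file with the index at the end is still valid, but a player must receive the whole file (or seek to the end) before it can start, which is why content prepared for download usually places 'moov' first.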

Evolution Across Releases

Rel-8 Initial

Introduced the MP4 file format as the standard container for 3GPP multimedia services, primarily for the Packet-Switched Streaming Service (PSS). It mandated support for H.264/AVC video and AAC audio codecs within the MP4 structure, defining specific constraints and signaling methods for mobile streaming and download applications.

Defining Specifications

Specification   Title
TS 26.140       3GPP TS 26.140
TS 26.141       3GPP TS 26.141
TS 26.244       3GPP TS 26.244
TS 26.245       3GPP TS 26.245