Description
Multi-stream Multiparty Conferencing Media Handling (MMCMH) is a capability of the Media Resource Function (MRF) in the IP Multimedia Subsystem (IMS), designed for conferencing services that go beyond a single mixed stream. In a traditional audio conference bridge, all participant audio streams are mixed into one composite stream, which is then sent back to each participant. With MMCMH, the MRF instead receives, processes, and forwards multiple discrete media streams to and from each conference participant. This per-stream handling is what makes advanced conferencing features possible in large multiparty calls, especially those involving video and spatial audio.
The architecture involves an Application Server (AS) that hosts the conferencing service logic and controls an MRF via media control protocols such as H.248 or the Media Control API framework. For each participant, an MRF implementing MMCMH can maintain separate incoming and outgoing streams for each media type. In a video conference, for example, it might receive one video stream from each participant. The MRF can then perform selective forwarding: rather than sending a single mixed video, it sends each participant a specific set of video streams (e.g., only the active speakers) chosen according to that participant's screen layout or preferences. For audio, it enables spatial audio scenes in which participants' voices appear to come from different directions in a virtual room, which can make it easier to tell speakers apart in large calls.
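The selective forwarding described above can be illustrated with a minimal sketch. It assumes that the MRF tracks a recent audio level per sender and forwards the loudest senders first; the `VideoStream` class, the `select_streams` function, and the audio-level ranking are hypothetical names and choices made for illustration, not anything taken from the 3GPP specifications.

```python
from dataclasses import dataclass


@dataclass
class VideoStream:
    """Hypothetical record for one incoming video stream."""
    participant: str      # sender of the stream
    audio_level: float    # recent speech energy, 0.0 (silent) to 1.0 (loud)


def select_streams(streams, receiver, max_streams):
    """Pick the senders whose video is forwarded to one receiver."""
    # A participant does not need their own video sent back to them.
    candidates = [s for s in streams if s.participant != receiver]
    # Rank by recent audio level so active speakers are forwarded first.
    ranked = sorted(candidates, key=lambda s: s.audio_level, reverse=True)
    return [s.participant for s in ranked[:max_streams]]


streams = [
    VideoStream("alice", 0.9),
    VideoStream("bob", 0.1),
    VideoStream("carol", 0.6),
    VideoStream("dave", 0.0),
]
# Each receiver gets its own selection, limited here to two streams.
forwarded_to_bob = select_streams(streams, receiver="bob", max_streams=2)
forwarded_to_alice = select_streams(streams, receiver="alice", max_streams=2)
```

Because the selection is computed per receiver, two participants in the same conference can be sent different sets of streams, which is the property that distinguishes this model from a single mixed output.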
The operation of MMCMH is defined in the 3GPP specifications TS 23.333 (Procedures) and TS 29.333 (Media Control API). The service logic in the AS determines the conferencing policy (who can speak, what layout to use) and sends directives to the MRF, which executes them on the media plane using its MMCMH capability. The MRF can also transcode between codecs, insert tones or announcements, record the conference, and apply real-time media analytics. Decoupling service control (in the AS) from media processing (in the MRF) lets each side scale and evolve independently: the MRF can be optimized for high-performance media manipulation, while the AS can be developed separately to deliver new user experiences.
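The split between policy in the AS and execution in the MRF can be sketched as follows. This is a toy model under stated assumptions: `Directive`, `ToyMrf`, and `build_directives` are hypothetical names, and the in-memory message format stands in for whatever control protocol a real deployment uses. It does not reproduce H.248 or any specified message syntax.

```python
from dataclasses import dataclass, field


@dataclass
class Directive:
    """Hypothetical control message from the AS to the MRF."""
    action: str          # "forward" or "mute" in this sketch
    participant: str
    params: dict = field(default_factory=dict)


class ToyMrf:
    """Stand-in for the media plane: it only records what it was told to do."""

    def __init__(self):
        self.muted = set()
        self.forwarding = {}   # receiver -> list of senders to forward

    def apply(self, directive):
        if directive.action == "mute":
            self.muted.add(directive.participant)
        elif directive.action == "forward":
            self.forwarding[directive.participant] = list(directive.params["senders"])
        else:
            raise ValueError(f"unsupported action: {directive.action}")


def build_directives(participants, speakers):
    """Service logic (AS side): turn a speaking policy into MRF directives."""
    directives = []
    for p in participants:
        if p not in speakers:
            directives.append(Directive("mute", p))
        # Everyone receives the permitted speakers, except their own stream.
        senders = [s for s in speakers if s != p]
        directives.append(Directive("forward", p, {"senders": senders}))
    return directives


mrf = ToyMrf()
for d in build_directives(["alice", "bob", "carol"], speakers=["alice", "bob"]):
    mrf.apply(d)
```

The policy function knows nothing about media, and the MRF stand-in knows nothing about who is allowed to speak; each could be replaced without touching the other, which is the design point the decoupling is meant to achieve.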
MMCMH supports 5G-era conferencing services that demand high quality, low latency, and rich interactivity, with use cases such as massive virtual meetings, interactive tele-education, and cloud gaming with live commentary. Handling streams individually also lets the network tailor delivery to each client: a participant on a poor connection might receive only an audio stream and a single video stream, while another on a 5G connection receives all video streams in high definition. This granular control is a significant evolution from the monolithic conferencing bridges of the past and is a key enabler for the Media Processing (MP) capabilities within the 5G Media Streaming architecture.
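The per-client tailoring in the example above can be sketched as a simple bandwidth budget. The function name `plan_delivery` and the bitrate figures (64 kbps for audio, 1500 kbps per video stream) are illustrative assumptions, not values from any specification, and the sender list is assumed to be ranked by priority already.

```python
def plan_delivery(available_kbps, video_senders, audio_kbps=64, video_kbps=1500):
    """Return (send_audio, senders whose video fits in the remaining budget).

    video_senders is assumed to be ordered by priority, most important first.
    """
    # Audio takes priority; without room for it, nothing is sent.
    if available_kbps < audio_kbps:
        return False, []
    remaining = available_kbps - audio_kbps
    # Fit as many whole video streams as the remaining budget allows.
    n_video = min(len(video_senders), remaining // video_kbps)
    return True, video_senders[:n_video]


senders = ["alice", "carol", "dave"]
poor_link = plan_delivery(2000, senders)    # constrained connection
fast_link = plan_delivery(20000, senders)   # high-bandwidth connection
```

A real MRF would likely also adapt resolution and codec per stream rather than only counting streams; the sketch covers just the stream-count decision.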
Purpose & Motivation
MMCMH was created to address the limitations of traditional conference mixing, which becomes inadequate for modern, large-scale, and media-rich collaborative sessions. Simple audio mixing suffers from quality degradation, loss of speaker identity, and an inability to support features like active speaker identification or whisper rooms. For video, simply compositing all feeds into one grid at the server is inefficient and inflexible, as it forces the same video layout on all clients regardless of their device capabilities or user preferences. The purpose of MMCMH is to provide a standardized, powerful media handling framework within IMS that can meet the demands of advanced, interactive multiparty services.
Historically, over-the-top (OTT) conferencing solutions offered per-stream control and innovative features, creating user expectations that legacy telecom conferencing services could not meet. 3GPP standardized MMCMH so that network operators and service providers could offer competitive, carrier-grade conferencing with strong reliability, security, and integration with other IMS services (such as emergency calling). It addresses scalability and quality in large conferences by distributing media intelligently and selectively rather than mixing everything by brute force.
Furthermore, MMCMH enables new business models and immersive experiences. It is a foundational technology for Augmented Reality (AR)/Virtual Reality (VR) social spaces and metaverse applications where spatial audio and independent video streams are essential for realism. By defining these capabilities in the 5G standards, 3GPP ensures interoperability between MRFs from different vendors and provides a clear path for developers to create next-generation communication services that leverage the high bandwidth and low latency of 5G networks. Its creation was motivated by the vision of making telecom networks a premier platform for real-time, interactive group communication.
Key Features
- Enables per-participant, per-media-stream processing in a conference, moving beyond monolithic mixing
- Supports selective forwarding of audio and video streams based on active speakers, user roles, or client capabilities
- Facilitates the creation of spatial audio scenes for immersive conference experiences
- Allows dynamic insertion of media (tones, recordings, announcements) into individual participant streams
- Provides media transcoding and adaptation between different codecs and formats for heterogeneous endpoints
- Exposes control via standardized APIs (e.g., Media Control API in TS 29.333) for flexible service development
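As a rough illustration of the spatial audio feature listed above, the sketch below spreads talkers across a frontal arc and derives left/right gains with constant-power stereo panning. Stereo panning is a deliberately simplified stand-in for real spatial rendering (which typically uses binaural or multichannel techniques), and the names `place_talkers` and `stereo_gains` and the 120-degree arc are assumptions made for this example.

```python
import math


def place_talkers(talkers, arc_degrees=120.0):
    """Spread talkers evenly across a frontal arc; 0 degrees is straight ahead."""
    if len(talkers) == 1:
        return {talkers[0]: 0.0}
    step = arc_degrees / (len(talkers) - 1)
    return {t: -arc_degrees / 2 + i * step for i, t in enumerate(talkers)}


def stereo_gains(azimuth_degrees, arc_degrees=120.0):
    """Constant-power pan: map an azimuth to (left, right) gains."""
    # Normalise azimuth to 0..1, where 0 is the left edge and 1 the right edge.
    position = (azimuth_degrees + arc_degrees / 2) / arc_degrees
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)


positions = place_talkers(["alice", "bob", "carol"])
left, right = stereo_gains(positions["bob"])   # bob sits in the centre
```

Constant-power panning keeps the sum of the squared gains at 1, so a voice keeps roughly the same perceived loudness wherever it is placed on the arc.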
Evolution Across Releases
Introduced as a new Media Resource Function (MRF) capability for advanced IMS-based conferencing. Specified in TS 23.333 and TS 29.333, it defined the architecture and Media Control API extensions required for Multi-stream Multiparty Conferencing Media Handling, enabling per-stream control for scalable and feature-rich conference services.
Defining Specifications
| Specification | Title |
|---|---|
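| TS 23.333 | Multimedia Resource Function Controller (MRFC) - Multimedia Resource Function Processor (MRFP) Mp interface: Procedures descriptions |
| TS 29.333 | Multimedia Resource Function Controller (MRFC) - Multimedia Resource Function Processor (MRFP) Mp interface; Stage 3 |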