Description
Speech Enabled Services (SES) is a broad term within 3GPP that refers to the standardization of packet-switched voice and speech services, primarily for LTE (as VoLTE) and 5G NR (as VoNR). It defines the end-to-end architecture, protocols, and codecs required to deliver carrier-grade voice over IP (VoIP) on IMS-based networks. The core of SES is the IP Multimedia Subsystem (IMS), which provides the control plane for session establishment, session management, and invocation of supplementary services such as call waiting. The user plane carries the voice packets over the Real-time Transport Protocol (RTP).
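To make the user-plane side of this split concrete, the following is a minimal sketch of the 12-byte fixed RTP header that precedes each voice frame. The class name `RtpHeader` and the example values (payload type 96, the SSRC) are illustrative assumptions, not taken from any SES specification; header extensions and CSRC lists are deliberately left out.

```python
import struct
from dataclasses import dataclass

RTP_VERSION = 2
# Fixed RTP header: two flag bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
_FIXED_HEADER = struct.Struct("!BBHII")


@dataclass
class RtpHeader:
    """Hypothetical helper for the fixed RTP header (no CSRC, no extension)."""
    payload_type: int      # dynamic payload types are negotiated per session
    sequence_number: int   # increments by one per packet sent
    timestamp: int         # expressed in media clock units
    ssrc: int              # synchronization source identifier
    marker: bool = False

    def pack(self) -> bytes:
        byte0 = RTP_VERSION << 6  # padding=0, extension=0, CSRC count=0
        byte1 = (int(self.marker) << 7) | (self.payload_type & 0x7F)
        return _FIXED_HEADER.pack(
            byte0,
            byte1,
            self.sequence_number & 0xFFFF,
            self.timestamp & 0xFFFFFFFF,
            self.ssrc & 0xFFFFFFFF,
        )

    @classmethod
    def unpack(cls, data: bytes) -> "RtpHeader":
        if len(data) < _FIXED_HEADER.size:
            raise ValueError("packet shorter than the 12-byte fixed RTP header")
        byte0, byte1, seq, ts, ssrc = _FIXED_HEADER.unpack_from(data)
        if byte0 >> 6 != RTP_VERSION:
            raise ValueError("unsupported RTP version")
        return cls(
            payload_type=byte1 & 0x7F,
            sequence_number=seq,
            timestamp=ts,
            ssrc=ssrc,
            marker=bool(byte1 >> 7),
        )
```

A sender would call `RtpHeader(...).pack()` and append the encoded speech frame; a receiver would call `RtpHeader.unpack(packet)` and use the sequence number and timestamp to reorder packets and schedule playout.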
A key technical component of SES is the set of speech and audio codecs it standardizes. This includes the Enhanced Voice Services (EVS) codec, introduced in 3GPP Release 12, which supports audio bandwidths up to super-wideband and fullband and is highly robust to packet loss. SES specifications cover the entire voice chain: the terminal's speech encoder and decoder, the radio access and core network bearers with appropriate Quality of Service (QoS), typically a dedicated guaranteed-bit-rate bearer identified by its own QoS Class Identifier (QCI), and the IMS core with its interconnection to other networks (such as the PSTN via the MGCF).
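As a rough worked example of how codec rate relates to the bit rate a voice bearer must carry, the sketch below adds RTP, UDP, and IP header overhead to a codec frame. It assumes 20 ms frames, one frame per packet, no RTP payload header, and no header compression; the function name `ip_bandwidth_bps` and its parameters are hypothetical, and real deployments typically compress headers on the radio link, so treat the result as an uncompressed upper bound.

```python
FRAME_MS = 20          # assumed speech frame duration
RTP_HEADER_BYTES = 12  # fixed RTP header, no extensions
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20
IPV6_HEADER_BYTES = 40


def ip_bandwidth_bps(codec_bps: int,
                     ip_header_bytes: int = IPV4_HEADER_BYTES,
                     payload_header_bytes: int = 0) -> int:
    """Return the IP-layer bit rate for one direction of a voice stream.

    Assumes one codec frame per RTP packet and no header compression.
    """
    packets_per_second = 1000 // FRAME_MS               # 50 packets per second
    bits_per_frame = codec_bps * FRAME_MS // 1000       # codec bits in one frame
    payload_bytes = (bits_per_frame + 7) // 8           # round up to whole bytes
    packet_bytes = (payload_bytes + payload_header_bytes
                    + RTP_HEADER_BYTES + UDP_HEADER_BYTES + ip_header_bytes)
    return packet_bytes * 8 * packets_per_second


# A 13.2 kbit/s codec mode yields 33-byte frames; with 40 bytes of
# IPv4/UDP/RTP headers each packet is 73 bytes.
ipv4_rate = ip_bandwidth_bps(13_200)
ipv6_rate = ip_bandwidth_bps(13_200, ip_header_bytes=IPV6_HEADER_BYTES)
```

Under these assumptions the headers are larger than the speech payload itself, which is one reason the bit rate reserved for a voice bearer is not simply the codec rate.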
SES also defines critical procedures for service continuity, such as Single Radio Voice Call Continuity (SRVCC), which allows an ongoing voice call to be handed over from a packet-switched network (LTE/5G) to a legacy circuit-switched network (2G/3G) when the user moves out of packet-switched voice coverage. For 5G, SES ensures voice service is natively supported on the 5G Core (5GC) with Voice over NR (VoNR), utilizing the same IMS core. The specifications detail the interaction between the UE, the RAN, the 5GC (specifically the AMF and SMF), and the IMS to establish an IMS voice bearer with the proper QoS flows. SES is not a single service but a framework ensuring that speech services meet strict requirements for delay, jitter, reliability, and audio quality in all-IP networks.
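Jitter, one of the requirements named above, is commonly measured at the receiver with the running interarrival-jitter estimator from the RTP specification (RFC 3550), which smooths the change in packet transit time with a gain of 1/16. The sketch below is an illustration only: the function name, the `(arrival_seconds, rtp_timestamp)` input format, and the 16 kHz default clock rate are assumptions, and timestamp wraparound is ignored for brevity.

```python
from typing import Iterable, Tuple


def interarrival_jitter(packets: Iterable[Tuple[float, int]],
                        clock_rate_hz: int = 16_000) -> float:
    """Estimate interarrival jitter in seconds.

    packets: (arrival_time_seconds, rtp_timestamp) pairs in arrival order.
    clock_rate_hz: RTP media clock rate (assumed value; set per codec).
    """
    jitter_units = 0.0
    previous_transit = None
    for arrival_s, rtp_timestamp in packets:
        # Transit time in RTP clock units (relative; clocks need not be synced).
        transit = arrival_s * clock_rate_hz - rtp_timestamp
        if previous_transit is not None:
            delta = abs(transit - previous_transit)
            # Exponential smoothing with gain 1/16, as in RFC 3550.
            jitter_units += (delta - jitter_units) / 16
        previous_transit = transit
    return jitter_units / clock_rate_hz
```

Feeding it evenly spaced packets, for example `[(0.00, 0), (0.02, 320), (0.04, 640)]`, leaves the estimate at zero, while a late packet raises it by one sixteenth of the extra delay.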
Purpose & Motivation
SES was developed to solve the fundamental challenge of delivering traditional, high-quality, circuit-switched-like voice services over all-packet-switched mobile network architectures such as LTE and 5G NR. Early LTE deployments were data-only and relied on Circuit-Switched Fallback (CSFB) to 2G/3G for voice, which added call setup delay and made for a suboptimal user experience. The purpose of SES, spearheaded by VoLTE, was to define a standardized, interoperable, and superior IP-based voice solution that could become the primary voice service.
It addressed the limitations of over-the-top (OTT) VoIP services by ensuring tight integration with the mobile network. This integration provides key operator-controlled advantages: guaranteed QoS with priority on the radio and transport networks, seamless mobility and handover (including handover to circuit-switched networks), integration with operator supplementary services (such as call forwarding), and emergency service support (e.g., location reporting for E911). Furthermore, SES aimed to improve voice quality beyond legacy narrowband calls by mandating support for wideband audio (HD Voice) and later super-wideband and fullband audio through the EVS codec.
The evolution into 5G required the continuation of this framework. The purpose expanded to ensure voice service is an integral, not bolted-on, part of 5G. VoNR under the SES umbrella ensures that 5G networks can deliver voice within tight latency and reliability bounds using the 5G QoS framework, supports new 5G core capabilities such as network slicing (for example, a dedicated voice slice), and maintains backward compatibility and service continuity with 4G VoLTE and 2G/3G circuit-switched voice. SES thus provides the roadmap for the long-term evolution of telephony in mobile networks.
Key Features
- Standardization of IMS-based Voice over LTE (VoLTE) and Voice over NR (VoNR)
- Specification of the Enhanced Voice Services (EVS) codec, with audio bandwidths up to super-wideband and fullband
- End-to-end QoS management with dedicated bearers (QCI 1 for voice)
- Service continuity mechanisms like SRVCC and PS-PS handover
- Support for rich communication services (RCS) and supplementary services
- Emergency services (e.g., IMS Emergency) support over packet networks
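The QoS item in the list above can be illustrated with a small lookup of the two LTE QoS classes usually associated with IMS voice: QCI 1 for the voice media bearer and QCI 5 for IMS signalling. This is a sketch, not an implementation of any network function. The names `QosClass` and `bearer_for` are hypothetical, and the characteristic values are quoted as commonly cited from the standardized QCI table, so they should be verified against the current specification before being relied on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosClass:
    qci: int
    resource_type: str             # "GBR" or "non-GBR"
    priority: int                  # lower value means higher priority
    packet_delay_budget_ms: int
    packet_error_loss_rate: float
    typical_use: str


# Assumed values, as commonly cited for the standardized QCI table.
QOS_CLASSES = {
    1: QosClass(1, "GBR", 2, 100, 1e-2, "conversational voice"),
    5: QosClass(5, "non-GBR", 1, 100, 1e-6, "IMS signalling"),
}

# Hypothetical traffic labels used only for this illustration.
_TRAFFIC_TO_QCI = {"voice_media": 1, "ims_signalling": 5}


def bearer_for(traffic: str) -> QosClass:
    """Return the QoS class assumed for a given kind of IMS voice traffic."""
    try:
        return QOS_CLASSES[_TRAFFIC_TO_QCI[traffic]]
    except KeyError:
        raise ValueError(f"no QoS class defined for traffic type {traffic!r}")
```

The split reflects the design choice that media and signalling have different needs: voice frames tolerate some loss but need a guaranteed rate, whereas signalling needs very low loss but no reserved rate.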
Evolution Across Releases
Laid the initial groundwork for IMS-based speech services in LTE, defining the basic IMS profile for voice and SMS. This included the initial specifications for VoLTE architecture, though comprehensive profiles and interoperability specifications came in later releases.
Defining Specifications
| Specification | Title |
|---|---|
| TS 26.177 | 3GPP TS 26.177 |
| TS 26.235 | 3GPP TS 26.235 |
| TS 26.236 | 3GPP TS 26.236 |
| TS 26.943 | 3GPP TS 26.943 |