MOS (Mean Opinion Score) — 3GPP Glossary

A standardized, subjective measure of the perceived quality of voice, audio, or video services. It is obtained by averaging the ratings of many human listeners or viewers, each given on a scale from 1 (bad) to 5 (excellent). MOS is the reference metric for quantifying and benchmarking the Quality of Experience (QoE) of multimedia telecommunication services.

Description

The Mean Opinion Score (MOS) is a quantitative measure of the perceived quality of a transmitted speech, audio, or video signal as experienced by the end user. It is defined in the ITU-T P-series recommendations (for example, P.800 for speech) and is widely adopted and elaborated within 3GPP specifications for service quality benchmarking. The core methodology is a subjective listening or viewing test in which a panel of human subjects rates the quality of processed audio/video samples under controlled conditions. Each subject gives an opinion score on an absolute category rating scale, typically: 5=Excellent, 4=Good, 3=Fair, 2=Poor, 1=Bad. The MOS is the arithmetic mean of all scores collected for a given test condition, which yields a single number between 1.0 and 5.0.
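The averaging step described above can be sketched in a few lines of Python. This is a minimal illustration, not a normative procedure: the function names and the sample ratings are hypothetical, and the 95% confidence interval (normal approximation) is an addition that listening-test reports commonly include alongside the mean, not something this entry prescribes.

```python
import math
import statistics

# The 5-point absolute category rating scale described in the text.
ACR_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def mean_opinion_score(ratings):
    """Return the arithmetic mean of the ACR ratings for one test condition."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if rating not in ACR_LABELS:
            raise ValueError(f"rating {rating!r} is not on the 1 to 5 ACR scale")
    return statistics.mean(ratings)


def ci95_half_width(ratings):
    """Half-width of a 95% confidence interval (normal approximation, an assumption)."""
    if len(ratings) < 2:
        raise ValueError("at least two ratings are required")
    return 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))


# Hypothetical ratings from eight listeners for a single test condition.
ratings = [4, 5, 3, 4, 4, 3, 5, 4]
mos = mean_opinion_score(ratings)
half_width = ci95_half_width(ratings)
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```

A real panel would be larger than eight listeners; the small sample here only keeps the arithmetic easy to follow.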

The process for obtaining a MOS is highly standardized to ensure reproducibility and comparability. Test subjects are selected based on specific criteria (e.g., normal hearing), and they listen to or view sequences of processed media in a quiet room using standardized equipment. The test material includes clean source samples and samples that have been degraded by a codec, packet loss, delay, jitter, or other network impairments. The subjects are not experts but represent typical users. The rigorous control of environmental, equipment, and procedural variables is essential to isolate the impact of the system under test on perceived quality. The resulting MOS provides a direct, human-centric link between technical network parameters and user satisfaction.

Within 3GPP, MOS is not a network Key Performance Indicator (KPI) that can be measured directly; it is the benchmark against which objective prediction models are calibrated. 3GPP specifications define performance requirements for codecs, network delay, packet loss robustness, and other factors with the goal of achieving a target MOS under specific conditions. For example, a specification may require that a voice codec achieve a MOS of 4.0 or higher under a defined packet loss scenario. Network planning, codec selection, and QoS parameter tuning all ultimately aim to maximize the MOS for the end user. MOS bridges the gap between engineering metrics (e.g., latency, jitter) and the subjective human experience, which makes it indispensable for service quality management and standardization.
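As a rough illustration of checking test conditions against a target MOS, the sketch below compares the per-condition mean with the 4.0 example threshold mentioned above. The condition names, the ratings, and the helper check_conditions are all hypothetical; real requirements and test conditions come from the relevant specification.

```python
from statistics import mean

TARGET_MOS = 4.0  # example target used in the text, not a normative value

# Hypothetical ACR ratings per test condition (illustrative data only).
ratings_by_condition = {
    "clean channel": [5, 4, 4, 5, 4, 4],
    "3% packet loss": [4, 4, 3, 4, 5, 4],
    "10% packet loss": [3, 2, 3, 3, 2, 3],
}


def check_conditions(ratings_by_condition, target):
    """Map each condition to (MOS, whether the MOS reaches the target)."""
    results = {}
    for condition, ratings in ratings_by_condition.items():
        mos = mean(ratings)
        results[condition] = (mos, mos >= target)
    return results


report = check_conditions(ratings_by_condition, TARGET_MOS)
for condition, (mos, ok) in report.items():
    verdict = "meets" if ok else "misses"
    print(f"{condition}: MOS={mos:.2f} {verdict} target {TARGET_MOS}")
```

A pass/fail decision on the bare mean ignores statistical uncertainty; in practice a confidence interval or significance test would normally accompany such a comparison.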

Purpose & Motivation

MOS was created to solve the fundamental problem of quantifying something inherently subjective: the quality of a communication experience. Before its standardization, comparing voice codecs or network performance was difficult and often based on inconsistent, proprietary listening tests. Engineers needed a reliable, repeatable method to evaluate how technical impairments (like bandwidth limitation, compression, or packet loss) actually impacted user perception. The MOS provided this common currency for quality.

Its adoption by 3GPP and other standards bodies was motivated by the need to define minimum performance requirements for digital cellular systems (GSM, 3G, 4G, 5G). As networks evolved from analog to digital and introduced lossy compression codecs to save bandwidth, it became critical to ensure that these advances did not degrade voice quality below an acceptable threshold. MOS provided the scale on which to set these thresholds (e.g., 'toll quality' is often associated with a MOS of about 4.0). It allows completely different technological approaches (e.g., the AMR-NB and EVS codecs) to be compared on a common basis, namely their impact on the human listener, which drives continuous improvement in speech and audio quality for mobile services.

Key Features

Subjective quality measurement based on human perception
Uses a standardized 5-point Absolute Category Rating (ACR) scale
Derived from listener/viewer panels large enough to give statistically meaningful results, tested under controlled lab conditions
Serves as the ground-truth benchmark for calibrating objective quality prediction models (e.g., PESQ, POLQA)
Defines quality targets for codec and network performance in 3GPP specifications
Applicable to speech, audio, video, and audiovisual multimedia services
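
The ground-truth role listed above can be illustrated by scoring an objective model against subjective results. The sketch below computes the Pearson correlation and the root-mean-square error between subjective MOS values and model predictions, two figures commonly used for this kind of comparison. All numbers are hypothetical, and the functions are plain-Python helpers, not part of PESQ, POLQA, or any 3GPP tool.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical per-condition results: subjective MOS from a listening test
# and the scores an objective model predicted for the same conditions.
subjective_mos = [1.8, 2.6, 3.4, 4.1, 4.4]
predicted_mos = [2.0, 2.5, 3.5, 4.0, 4.5]

r = pearson(subjective_mos, predicted_mos)
error = rmse(subjective_mos, predicted_mos)
print(f"Pearson r = {r:.3f}, RMSE = {error:.3f}")
```

A high correlation with a low error suggests the objective model tracks the panel's judgement for these conditions; it says nothing about conditions outside the tested set.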

Evolution Across Releases

Rel-4 Initial

Formally introduced MOS into 3GPP specifications as a key performance metric for speech quality evaluation, particularly for the evolving 3G/UMTS services. It established the methodology for setting speech quality requirements based on MOS, providing a common framework to assess and compare the performance of circuit-switched and early packet-switched voice codecs and transport.


Defining Specifications

Specification	Title
TS 21.905	3GPP TS 21.905
TS 22.925	3GPP TS 22.925
TS 26.077	3GPP TS 26.077
TS 26.247	3GPP TS 26.247
TS 26.906	3GPP TS 26.906
TS 26.909	3GPP TS 26.909
TS 26.921	3GPP TS 26.921
TS 26.926	3GPP TS 26.926
TS 26.935	3GPP TS 26.935
TS 26.936	3GPP TS 26.936
TS 26.938	3GPP TS 26.938
TS 26.952	3GPP TS 26.952
TS 26.955	3GPP TS 26.955
TS 26.975	3GPP TS 26.975
TS 26.976	3GPP TS 26.976
TS 26.978	3GPP TS 26.978
TS 26.989	3GPP TS 26.989
TS 29.520	3GPP TS 29.520
TR 46.008	3GPP TR 46.008
TR 46.055	3GPP TR 46.055