MOS

Mean Opinion Score

Services
Introduced in Rel-4
A standardized, subjective measure of the perceived quality of voice, audio, or video services. It is obtained by averaging the ratings of many human listeners or viewers, who score quality on a scale from 1 (bad) to 5 (excellent). MOS is the reference metric for quantifying and benchmarking Quality of Experience (QoE) in multimedia telecommunication services.

Description

The Mean Opinion Score (MOS) is a quantitative measure of the perceived quality of a transmitted speech, audio, or video signal as experienced by the end-user. It is defined in the ITU-T P-series recommendations (notably P.800 for listening tests) and is widely adopted and elaborated within 3GPP specifications for service quality benchmarking. The core methodology is a subjective listening or viewing test in which a panel of human subjects rates the quality of processed audio/video samples under controlled conditions. Each subject gives an opinion score on an Absolute Category Rating (ACR) scale, typically: 5=Excellent, 4=Good, 3=Fair, 2=Poor, 1=Bad. The MOS is the arithmetic mean of all scores collected for a given test condition, which yields a single number between 1.0 and 5.0.
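As a minimal sketch of the averaging step described above, the following Python snippet computes a MOS from a list of ACR ratings for one test condition. The function name `mean_opinion_score`, the sample ratings, and the accompanying 95% confidence interval (a normal approximation) are illustrative assumptions and are not taken from any 3GPP or ITU-T text.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mos, ci95_half_width) for ACR ratings on the 1..5 scale.

    Hypothetical helper: the MOS is the arithmetic mean of the ratings;
    the interval uses a normal approximation (1.96 * s / sqrt(n)),
    which is an assumption of this sketch.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ACR ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from eight listeners for a single test condition.
ratings = [4, 5, 3, 4, 4, 5, 3, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Reporting the interval alongside the mean makes it visible how much a MOS from a small panel can vary, which is one reason real tests use larger panels.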

The process for obtaining a MOS is highly standardized to ensure reproducibility and comparability. Test subjects are non-experts who represent typical users and are selected against specific criteria (e.g., normal hearing). They listen to or view sequences of processed media in a quiet room using standardized equipment. The test material includes clean source samples and samples that have been degraded by a codec, packet loss, delay, jitter, or other network impairments. Rigorous control of environmental, equipment, and procedural variables is essential to isolate the impact of the system under test on perceived quality. The resulting MOS provides a direct, human-centric link between technical network parameters and user satisfaction.

Within 3GPP, MOS is not a directly measurable network Key Performance Indicator (KPI) but the benchmark against which objective prediction models are calibrated. 3GPP specifications define performance requirements for codecs, network delay, packet loss robustness, and other factors with the goal of achieving a target MOS under specific conditions. For example, a specification may require that a voice codec achieve a MOS of 4.0 or higher under a defined packet loss scenario. Network planning, codec selection, and QoS parameter tuning are all ultimately aimed at maximizing the MOS for the end-user. MOS bridges the gap between engineering metrics (e.g., latency, jitter) and the subjective human experience, which makes it indispensable for service quality management and standardization.
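To illustrate how an engineering metric can be mapped to an estimated MOS and checked against a target, the sketch below uses the R-factor to MOS conversion from the ITU-T G.107 E-model. The E-model is not mentioned in this entry and is brought in here only as an example of an objective estimate; the function names and the 4.0 target are assumptions. The result is a predicted score, not a subjective MOS obtained from listeners.

```python
def r_to_mos(r):
    """Map an E-model transmission rating R to an estimated MOS.

    Uses the ITU-T G.107 conversion; the output is a prediction,
    not a score collected from a listening panel.
    """
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def meets_target(r, target_mos=4.0):
    """Hypothetical check of an estimated MOS against a target value."""
    return r_to_mos(r) >= target_mos


# Illustrative R values only; real values come from an E-model calculation.
for r in (70.0, 80.0, 93.2):
    print(f"R = {r:5.1f} -> estimated MOS = {r_to_mos(r):.2f}, "
          f"meets 4.0 target: {meets_target(r)}")
```

In this sketch, delay, loss, and codec impairments would lower R first, and the conversion then expresses that loss on the familiar 1 to 5 scale.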

Purpose & Motivation

MOS was created to solve the fundamental problem of quantifying something inherently subjective: the quality of a communication experience. Before its standardization, comparing voice codecs or network performance was difficult and often based on inconsistent, proprietary listening tests. Engineers needed a reliable, repeatable method to evaluate how technical impairments (like bandwidth limitation, compression, or packet loss) actually impacted user perception. The MOS provided this common currency for quality.

Its adoption by 3GPP and other standards bodies was motivated by the need to define minimum performance requirements for digital cellular systems (GSM, 3G, 4G, 5G). As networks evolved from analog to digital and introduced lossy compression codecs to save bandwidth, it became critical to ensure that these advances did not degrade voice quality below an acceptable threshold. MOS provided the scale on which to set these thresholds (e.g., 'toll quality' is often associated with a MOS of about 4.0). It allows a like-for-like comparison of completely different technological approaches (e.g., the AMR-NB and EVS codecs) based on their ultimate impact on the human listener, driving continuous improvement in speech and audio quality for mobile services.

Key Features

  • Subjective quality measurement based on human perception
  • Uses a standardized 5-point Absolute Category Rating (ACR) scale
  • Derived from statistically significant listener/viewer panels under controlled lab conditions
  • Serves as the ground-truth benchmark for calibrating objective quality prediction models (e.g., PESQ, POLQA)
  • Defines quality targets for codec and network performance in 3GPP specifications
  • Applicable to speech, audio, video, and audiovisual multimedia services
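The ground-truth role listed above can be sketched as a comparison between subjective MOS values and the scores an objective model predicts for the same test conditions. The snippet below computes the Pearson correlation and the root-mean-square error between the two. The data are invented for illustration, the helper names are hypothetical, and real evaluations of models such as PESQ or POLQA follow more elaborate procedures than this.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Invented values: one subjective MOS and one model prediction per condition.
subjective_mos = [1.8, 2.6, 3.4, 4.1, 4.4]
predicted_mos = [2.0, 2.5, 3.5, 4.0, 4.3]

print(f"correlation = {pearson(subjective_mos, predicted_mos):.3f}")
print(f"rmse        = {rmse(subjective_mos, predicted_mos):.3f}")
```

A high correlation with a low error suggests the objective model tracks listener opinion well on these conditions; it says nothing about conditions outside the tested set.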

Evolution Across Releases

Rel-4 Initial

Formally introduced MOS into 3GPP specifications as a key performance metric for speech quality evaluation, particularly for the evolving 3G/UMTS services. It established the methodology for setting speech quality requirements based on MOS, providing a common framework to assess and compare the performance of circuit-switched and early packet-switched voice codecs and transport.

Defining Specifications

Specification Title
TS 21.905 3GPP TS 21.905
TS 22.925 3GPP TS 22.925
TS 26.077 3GPP TS 26.077
TS 26.247 3GPP TS 26.247
TS 26.906 3GPP TS 26.906
TS 26.909 3GPP TS 26.909
TS 26.921 3GPP TS 26.921
TS 26.926 3GPP TS 26.926
TS 26.935 3GPP TS 26.935
TS 26.936 3GPP TS 26.936
TS 26.938 3GPP TS 26.938
TS 26.952 3GPP TS 26.952
TS 26.955 3GPP TS 26.955
TS 26.975 3GPP TS 26.975
TS 26.976 3GPP TS 26.976
TS 26.978 3GPP TS 26.978
TS 26.989 3GPP TS 26.989
TS 29.520 3GPP TS 29.520
TR 46.008 3GPP TR 46.008
TR 46.055 3GPP TR 46.055