MUSHRA

Multiple Stimulus with Hidden Reference and Anchors method

Introduced in Rel-8

MUSHRA is a standardized subjective audio quality assessment method in which listeners rate multiple processed samples, presented together with a hidden reference and anchor samples, against a known reference in order to evaluate speech and audio codec quality.

Category: Services
Introduced: Rel-8
Where: Services › Codecs
Specifications: 4 specs

Description

The Multiple Stimulus with Hidden Reference and Anchors (MUSHRA) method, defined in Recommendation ITU-R BS.1534, is a controlled procedure for subjectively evaluating the perceptual quality of intermediate to high-quality audio codecs and processing systems. In a MUSHRA test, a panel of listeners with normal hearing is presented with a series of test items. For each item, the listener can play the labeled original (the open reference) and several anonymously labeled versions (stimuli) of the same source audio, e.g., A, B, C. These stimuli comprise the codec or processing outputs under test, a hidden copy of the unprocessed reference, and anchor stimuli: typically a low-quality anchor (e.g., a 3.5 kHz low-pass filtered version of the original) and a mid-quality anchor (e.g., a 7 kHz low-pass filtered version). The stimuli are presented in randomized order.

The listener rates each stimulus on a continuous scale from 0 to 100, divided into five equal intervals labeled bad, poor, fair, good, and excellent, judging it against the open reference. The hidden reference serves as an internal control on listener reliability, since an attentive listener should rate it at or near the top of the scale. The anchors provide fixed points on the quality scale, which makes scores more comparable across tests and laboratories. The result for a codec is its mean score across listeners and test items, usually reported with a confidence interval. This is a mean MUSHRA score on the 0 to 100 scale, not the five-point Mean Opinion Score (MOS) produced by ACR tests.
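As a rough illustration of the aggregation and the hidden-reference check described above, the following Python sketch pools ratings per condition after dropping unreliable listeners. The data layout, the names `is_reliable` and `mean_scores`, and the sample scores are all hypothetical. The screening thresholds (hidden reference rated below 90 on more than 15% of items) are loosely modeled on the post-screening guidance in ITU-R BS.1534 and should be treated as assumptions, not as what any particular 3GPP test plan prescribes.

```python
from statistics import mean

# Hypothetical layout: ratings[listener][item][condition] -> score on the 0-100 scale.
ratings = {
    "listener_1": {
        "item_1": {"reference": 100, "anchor_low": 20, "codec_x": 70},
        "item_2": {"reference": 95, "anchor_low": 15, "codec_x": 60},
    },
    "listener_2": {
        "item_1": {"reference": 60, "anchor_low": 30, "codec_x": 80},
        "item_2": {"reference": 70, "anchor_low": 25, "codec_x": 75},
    },
}


def is_reliable(listener_ratings, threshold=90, max_fraction=0.15):
    """Return True if the listener rated the hidden reference below
    `threshold` on at most `max_fraction` of the items (assumed rule)."""
    references = [scores["reference"] for scores in listener_ratings.values()]
    misses = sum(1 for score in references if score < threshold)
    return misses / len(references) <= max_fraction


def mean_scores(ratings):
    """Average each condition over all items of all reliable listeners."""
    pooled = {}
    for listener_ratings in ratings.values():
        if not is_reliable(listener_ratings):
            continue  # skip listeners who failed the hidden-reference check
        for scores in listener_ratings.values():
            for condition, score in scores.items():
                pooled.setdefault(condition, []).append(score)
    return {condition: mean(values) for condition, values in pooled.items()}


result = mean_scores(ratings)
print(result)
```

With this sample data, `listener_2` is excluded because both of their hidden-reference ratings fall below 90, so the means come from `listener_1` alone: 97.5 for the reference, 17.5 for the low anchor, and 65 for `codec_x`. A real analysis would also report confidence intervals per condition.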

Purpose & Motivation

MUSHRA was developed to address the limitations of simpler listening test methods such as Absolute Category Rating (ACR), whose coarse five-point scale and lack of a reference make it hard to resolve subtle differences between good codecs. Before MUSHRA, comparing advanced wideband or full-band codecs was difficult because scores lacked sensitivity and context. The method provides a reliable, repeatable way to rank speech and audio codecs, such as EVS, AMR-WB, and the 3GPP audio codecs for multimedia services. It reduces subjective bias by hiding the reference among the stimuli and including anchors, which stabilize the rating scale across listener panels and test sessions. This matters for 3GPP standardization: objective metrics such as PESQ are not sufficient on their own, so listening tests are needed to choose among competing codec proposals for inclusion in the specifications and to safeguard quality of experience for end users.

Evolution Across Releases

Rel-8 Initial

MUSHRA was formally adopted and specified within 3GPP as the recommended method for subjective testing of wideband speech codecs and audio systems. This release established the core test procedure and the requirements for listeners, equipment, and environment, solidifying the method's role in codec qualification.

Defining Specifications

3GPP specifications that define or reference MUSHRA, with the latest known release. Sourced from the 3GPP document catalog.

Specification     Title                                                 Release
TS 26.818 vf00    Audio Media Profiles Test Results for VR Streaming    Rel-15
TR 26.936 vj00    Audio Codec Characterization Technical Report         Rel-19
TR 26.950 vj00    Surround Sound in 3GPP Services Study                 Rel-19
TR 26.996 vj00    ISAR Split Rendering Audio Characterization           Rel-19