SOLA

Synchronized Overlap-Add

Introduced in Rel-12

SOLA is a 3GPP-standardized digital signal processing algorithm for high-quality time-scale modification of audio, enabling features like adjustable voice message playback speed without pitch alteration.

Category: Services
Introduced: Rel-12
Where: Services › Codecs
Specifications: 1 spec

Description

Synchronized Overlap-Add (SOLA) is a time-domain algorithm for time-scale modification (TSM) of audio signals. Its primary function is to change the duration (or playback rate) of an audio segment without affecting its perceived pitch or tonal characteristics. This is distinct from simple resampling, which changes both speed and pitch. The SOLA algorithm works by decomposing the input audio signal into short, overlapping analysis frames. These frames are then repositioned along the time axis according to the desired speed change factor (α): the spacing between frames in the input is α times their spacing in the output. If α > 1, the signal is sped up (compressed in time); if α < 1, it is slowed down (expanded in time).
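The frame repositioning described above can be sketched in a few lines. This is only an illustration of the bookkeeping, not the procedure from TS 26.448: the function name frame_starts and the parameters frame_len and synthesis_hop are hypothetical, and the example values (320-sample frames, 160-sample output hop) are assumptions chosen for readability.

```python
import numpy as np

def frame_starts(num_samples, frame_len, synthesis_hop, alpha):
    """Return (analysis_starts, synthesis_starts) for speed factor alpha.

    Frames are read from the input every alpha * synthesis_hop samples
    and written to the output every synthesis_hop samples, so alpha > 1
    shortens the signal and alpha < 1 lengthens it.
    """
    analysis_hop = int(round(alpha * synthesis_hop))
    if analysis_hop < 1:
        raise ValueError("alpha is too small for this synthesis hop")
    last_start = num_samples - frame_len
    if last_start < 0:
        raise ValueError("signal is shorter than one frame")
    analysis = np.arange(0, last_start + 1, analysis_hop)
    synthesis = np.arange(len(analysis)) * synthesis_hop
    return analysis, synthesis

# Assumed example: 1 s of audio at 16 kHz, played back twice as fast.
analysis, synthesis = frame_starts(16000, frame_len=320,
                                   synthesis_hop=160, alpha=2.0)
nominal_output_len = synthesis[-1] + 320
```

With these assumed values the nominal output is about half the input length. Placing frames at these fixed output positions without any alignment is the plain overlap-add that SOLA improves on.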

The core innovation of SOLA lies in the 'synchronized' cross-correlation step performed during the overlap-add process. When two consecutive output frames are overlapped and added together to create a continuous output waveform, a direct overlap-add with fixed spacing can cause phase discontinuities, leading to audible distortion or 'clicks.' SOLA mitigates this by calculating the cross-correlation between the overlapping portions of the two frames. It then identifies the time lag (or shift) that maximizes their similarity. The second frame is shifted by this optimal lag before the overlap-add operation, effectively synchronizing the waveforms at their overlap point. This synchronization preserves the periodic structure of quasi-stationary signals like voiced speech, resulting in a smooth, high-quality output with minimal artifacts.
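The synchronization step can be sketched as follows. This is a minimal textbook-style SOLA under stated assumptions, not the implementation from TS 26.448: the function name sola, the default frame_len, synthesis_hop and max_shift values, the normalized cross-correlation measure, and the linear cross-fade are all illustrative choices.

```python
import numpy as np

def sola(x, alpha, frame_len=320, synthesis_hop=160, max_shift=40):
    """Time-scale x by speed factor alpha (>1 faster, <1 slower)."""
    x = np.asarray(x, dtype=np.float64)
    analysis_hop = int(round(alpha * synthesis_hop))
    if analysis_hop < 1:
        raise ValueError("alpha is too small for this synthesis hop")
    # Guarantees that the unshifted position always yields a valid overlap.
    if not 0 <= max_shift < min(synthesis_hop, frame_len - synthesis_hop):
        raise ValueError("max_shift is too large for these frame settings")
    if len(x) < frame_len:
        return x.copy()

    y = x[:frame_len].copy()  # the first frame is copied unchanged
    m = 1
    while m * analysis_hop + frame_len <= len(x):
        frame = x[m * analysis_hop : m * analysis_hop + frame_len]
        nominal = m * synthesis_hop  # unshifted output position
        best_start, best_score = nominal, -np.inf
        for k in range(-max_shift, max_shift + 1):
            start = nominal + k
            ov = len(y) - start  # overlap length for this shift
            if start < 0 or ov < 1 or ov >= frame_len:
                continue
            a, b = y[start:], frame[:ov]
            # Normalized cross-correlation of the overlapping parts.
            score = np.dot(a, b) / (np.sqrt(np.dot(a, a) * np.dot(b, b)) + 1e-12)
            if score > best_score:
                best_start, best_score = start, score
        ov = len(y) - best_start
        fade_in = np.linspace(0.0, 1.0, ov, endpoint=False)
        # Cross-fade the overlap, then append the rest of the frame.
        y[best_start:] = y[best_start:] * (1.0 - fade_in) + frame[:ov] * fade_in
        y = np.concatenate([y, frame[ov:]])
        m += 1
    return y

# Assumed example: a 200 Hz tone at 16 kHz, played back twice as fast.
fs = 16000
tone = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)
faster = sola(tone, alpha=2.0)
```

Because each frame may be shifted by up to max_shift samples, the output length is only approximately the input length divided by α. The search range should cover at least one pitch period of the lowest expected voice pitch; the default here is an assumption, not a recommended setting.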

Within the 3GPP context, SOLA is specified in TS 26.448, the jitter buffer management specification for the Enhanced Voice Services (EVS) codec. It is a key component for implementing playback rate control for voice messages or recorded speech. The algorithm operates on the decoded audio signal as a post-processing step, so it does not depend on the encoded bitstream. Its parameters, such as analysis frame length and overlap length, are tuned for speech signals to balance computational complexity against output quality. By standardizing SOLA, 3GPP aims for interoperable, consistent time-scale modification across different devices and networks.

Purpose & Motivation

SOLA was developed to solve the problem of changing speech playback speed in a natural and intelligible way. Prior to advanced TSM algorithms, simple methods like sample repetition or deletion caused severe perceptual artifacts, such as warbling, reverberation, or robotic sounds, especially for speech signals. The need for high-quality time-scale modification arose from practical user features, such as listening to voice messages at an accelerated rate to save time or slowing down a message for better comprehension, particularly in noisy environments or for language learners.

The adoption and standardization of SOLA within 3GPP were motivated by the evolution of voice services from simple calls to rich multimedia communication. Features like voice messaging, audio note playback, and real-time transcription support became important. SOLA provides a computationally efficient and high-quality solution suitable for implementation on mobile devices. It addresses the limitation of earlier overlap-add techniques by dynamically synchronizing waveform segments, which is crucial for maintaining the natural prosody and intelligibility of time-scaled speech. Its inclusion in the EVS codec specifications ensures that this enhanced functionality is widely available and consistently implemented, contributing to a richer voice service ecosystem.

Evolution Across Releases

Rel-12 Initial

SOLA was introduced and standardized in 3GPP within TS 26.448 as part of the Enhanced Voice Services (EVS) codec work item. This initial specification defined the algorithm's operation, parameters, and its application for playback rate control of speech, establishing a baseline for high-quality time-scale modification in mobile voice services.

Defining Specifications

3GPP specifications that define or reference SOLA, with the latest known release. Sourced from the 3GPP document catalog.

Specification    Title                                        Release
TS 26.448 vj00   EVS Jitter Buffer Management Specification   Rel-19