Performance characterization of the GSM Enhanced Full Rate (EFR) speech codec

Specification: 46055

Summary

This document reports the results from the Pre-selection and Verification Phase of testing of the Enhanced Full Rate speech codec.

3GPP TR 46.055 v. 8.0.0

Technical Specification

3rd Generation Partnership Project;

Technical Specification Group Services and System Aspects;

Performance characterization of the GSM

Enhanced Full Rate (EFR) speech codec

(Release 8)

 

 

                                                                          

The present document has been developed within the 3rd Generation Partnership Project (3GPP TM) and may be further elaborated for the purposes of 3GPP.    
The present document has not been subject to any approval process by the 3GPP Organizational Partners and shall not be implemented. 
This Specification is provided for future development work within 3GPP only. The Organizational Partners accept no liability for any use of this Specification.
Specifications and reports for implementation of the 3GPP TM system should be obtained via the 3GPP Organizational Partners' Publications Offices.

 

 


 

Keywords

GSM, speech, codec

 

3GPP

Postal address

 

3GPP support office address

650 Route des Lucioles - Sophia Antipolis

Valbonne - FRANCE

Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16

Internet

http://www.3gpp.org

 

Copyright Notification

No part may be reproduced except as authorized by written permission.
The copyright and the foregoing restriction extend to reproduction in all media.

 

© 2008, 3GPP Organizational Partners (ARIB, ATIS, CCSA, ETSI, TTA, TTC).

All rights reserved.

 

UMTS™ is a Trade Mark of ETSI registered for the benefit of its members

3GPP™ is a Trade Mark of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners
LTE™ is a Trade Mark of ETSI currently being registered for the benefit of its Members and of the 3GPP Organizational Partners

GSM® and the GSM logo are registered and owned by the GSM Association

 


Contents

Foreword
Introduction
1 Scope
2 References
3 Abbreviations
4 Quality under error (EP0 – EP3) and tandeming conditions (Exp Number 1 and Exp Number 5)
5 Quality under background noise conditions (Exp Number 2 and Exp Number 3)
6 Talker dependency (Exp Number 4)
7 DTX system
7.1 Channel activity in DTX mode
7.1.1 Test procedure
7.1.2 Speech channel activity
7.1.3 Level compensation
7.1.4 Interleaving compensation
7.1.5 Estimated mean TDMA channel activity
7.2 DTX/CNI Informal Expert Listening tests
7.2.1 Introduction
7.2.2 Test environment
7.2.3 Results
8 Performance with DTMF tones
8.1 Introduction
8.2 Test environment
8.3 Results
9 Network information tones
10 Performance with special input signals
10.1 Music signals
10.2 Noise signals
11 Performance with different languages
12 Delay
13 Frequency response
13.1 Introduction
13.2 Test environment
13.3 Results
14 Complexity
15 Summary of the results from the subjective testing
Annex A: Summary of results (lab by lab)
A.1 Quality under Error and tandeming conditions
A.2 Quality under Background noise conditions
A.3 Quality for Talker Dependency (DMOS and SD)
Annex B: Change history

 


Foreword

This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).

The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:

Version x.y.z

where:

x    the first digit:

1    presented to TSG for information;

2    presented to TSG for approval;

3    or greater indicates TSG approved document under change control.

y    the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.

z    the third digit is incremented when editorial only changes have been incorporated in the document.


Introduction

The SMG2 Speech Expert Group (SEG) started its activity early in 1995 for the standardization of an Enhanced Full Rate speech codec. The Group produced a test plan for the first phase of testing (pre-selection phase), described in permanent document SEG‑4 (ETSI SMG2 SEG: SEG‑4 (v 1.0) "A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm"), to assess the performance of the submitted candidates. This test plan is based on the general knowledge coming from past ITU‑T and ETSI activities on codec evaluation (for instance, the recent GSM half rate and ITU‑T 8 kbit/s exercises). At the end of this Pre-selection Phase, SMG decided to standardize the PCS 1 900 codec, known as the US‑1 codec, and no formal characterisation testing has been performed for the selected codec.

The present document therefore reports the results from the Pre-selection and Verification Phase of testing only. Consequently, the results reported here are less detailed, and the confidence intervals for them are wider, than those obtained for the GSM half rate standardization (GSM 06.08, [3]) where specific and detailed characterisation testing was performed. In addition, not all laboratories followed the same pre-selection test plan, further complicating the interpretation of the results.

The following experiments included in SEG‑4 were carried out by several laboratories in the Pre-selection Phase:

-     Experiment 1: Quality under error and tandeming conditions (A-law, Modified IRS);

-     Experiment 2: Quality under background noise conditions (Vehicular noise, UPCM, NoIRS);

-     Experiment 3: Quality under background noise conditions (Background music, UPCM, NoIRS);

-     Experiment 4: Talker Dependency (UPCM, NoIRS);

-     Experiment 5: Quality under high error conditions – EP3 (A-law, Modified IRS).

A practical 'indirect' method of performance comparison between different results was adopted, utilising the Modulated Noise Reference Unit (MNRU) (see note) as a reference degradation. The MNRU provides the additional function of allowing normalisation of results across different laboratories carrying out the same experiment, through the conversion of MOS scores to Equivalent Q (dB). The Q (dB) values introduced in a test normally range from 0 to 50 dB. In SEG‑4, both Experiment#1 and Experiment#5 on error conditions cover this range; the other experiments do not.

NOTE:      The MNRU is a device designed for producing speech correlated noise that sounds subjectively like the quantising noise produced by log-companded PCM codecs. The device is subjectively calibrated for Mean Opinion Scores (MOS) against Q dB (where Q is the ratio of the speech to speech-correlated noise power). The 'Equivalent Q' of the codecs under test can be found from the corresponding MOS on the calibration curve of the MNRU (S-shaped curve).
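The conversion from MOS to Equivalent Q described in the note can be read as an inverse lookup on a laboratory's MNRU calibration curve. The following is a minimal sketch, assuming piecewise-linear interpolation between the MNRU anchor points (the present document only states that the curve is S-shaped, not how it was fitted). The function name `equivalent_q` and the calibration values are hypothetical and are not taken from any laboratory's results.

```python
def equivalent_q(mos, calibration):
    """Map a codec MOS to Equivalent Q (dB) on an MNRU calibration curve.

    calibration: list of (q_db, mos) pairs measured for the MNRU conditions
    of the same experiment. MOS is assumed to increase with Q.
    """
    points = sorted(calibration)              # ascending Q
    # Clamp outside the calibrated range instead of extrapolating.
    if mos <= points[0][1]:
        return points[0][0]
    if mos >= points[-1][1]:
        return points[-1][0]
    for (q_lo, m_lo), (q_hi, m_hi) in zip(points, points[1:]):
        if m_lo <= mos <= m_hi:
            if m_hi == m_lo:                  # flat segment of the curve
                return q_lo
            frac = (mos - m_lo) / (m_hi - m_lo)
            return q_lo + frac * (q_hi - q_lo)
    raise ValueError("calibration curve is not monotonic")


# Hypothetical calibration points (Q in dB, MOS), for illustration only.
MNRU_CURVE = [(0, 1.0), (10, 1.5), (20, 2.5), (30, 3.5), (40, 4.2), (50, 4.5)]

print(equivalent_q(3.0, MNRU_CURVE))
```

Because each laboratory derives its own curve from its own listeners, two laboratories with different absolute MOS scales can still be compared on the common Q (dB) axis.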

Only four laboratories ran tests which followed the Pre-selection Test Plan described in SEG‑4 (BT/lab1, CNET/lab2, Tele Denmark/lab3, NEC/lab4). MOTOROLA/lab5 participated in the Pre-selection Phase but their experiments did not comply with SEG‑4. TI/lab8 ran one experiment only from SEG‑4. Results produced by COMSAT/lab6 following a NOKIA-designed test plan are part of standardization of the codec in North America and NOKIA/lab7 performed complementary experiments during the ETSI Pre-selection Phase.

As no further analysis has been undertaken to allow the averaging of scores across the different laboratories, results are reported in the annex on a laboratory-by-laboratory basis. For error and tandeming conditions, results are reported in terms of Equivalent Q (dB) values. For background noise conditions and talker dependency, results are reported in terms of DMOS values with either Confidence Interval (CI) or Standard Deviation (SD), as there is insufficient data available to normalise across laboratories via MNRU conditions.

The quality performance of the EFR codec is compared to High and Low references introduced in permanent documents SEG‑3 (ETSI SMG2 SEG: SEG‑3 "Selection Criteria for the Enhanced Full Rate Speech Coding Algorithm – Speech Quality Requirements") and SEG‑4 (ETSI SMG2 SEG: SEG‑4 (v 1.0) "A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm", Section 7). These references were chosen as representative of the "minimum" and "objective" performance targets respectively, and are reported in table 1.

Table 1: References per condition: High Ref., Low Ref. and G.728

| Experiment (SEG‑4) | Condition | High Ref | Low Ref |
|---|---|---|---|
| EXP#1 | EP0 | G.728 | G.728 |
| EXP#1 | EP1 | MNRU 24 dB | TCH-FS (EP1) |
| EXP#1 | EP2 | TCH-FS (EP1) | TCH-FS (EP2) |
| EXP#5 | EP3 | TCH-FS (EP2) | TCH-FS (EP3) |
| EXP#1 | EP0 (tandem) | G.728 | G.728 |
| EXP#1 | EP1 (tandem) | TCH-FS (EP1) | TCH-FS (EP1 tandem) |
| EXP#2 | Vehicle 10 | G.728 | G.728 |
| EXP#3 | Music 20 | G.728 | G.728 |
| EXP#4 | Male Talkers | G.728 | G.728 |
| EXP#4 | Female Talkers | G.728 | G.728 |
| EXP#4 | Children | G.728 | G.728 |

 

A figure showing the general trend of the EFR behaviour for error conditions in noise-free environment, compared to the high (G.728) and low (TCH-FS) references is added to individual laboratories' quantitative results (figure 15). The general quality performance of the EFR codec is summarised in table 15.

In the Verification Phase, the behaviour of the EFR codec under the following test conditions was tested:

-     behaviour of the DTX System;

-     performance with DTMF tones;

-     performance with network information tones;

-     performance with special input signals;

-     performance with music signals;

-     performance with noise signals;

-     performance with different languages;

-     delay of the TCH-EFR;

-     frequency response;

-     complexity.

The results of these tests are also included in this report under the respective clauses.

Furthermore, the EFR codec was checked for correct functioning for the following items:

-     test of overload point;

-     SID frame encoding;

-     muting behaviour;

-     idle channel behaviour.

No artefact or malfunctioning was detected for these items.


1        Scope

The present document gives background information on the performance of the GSM enhanced full rate speech codec. Experimental results from the Pre-selection and Verification tests carried out during the standardization process by the SEG (Speech Expert Group) are reported to give a more detailed picture of the behaviour of the GSM enhanced full rate speech codec under different conditions of operation.

2        References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

-     References are either specific (identified by date of publication, edition number, version number, etc.) or non‑specific.

-     For a specific reference, subsequent revisions do not apply.

-     For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1]                          GSM 03.05: "Digital cellular telecommunications system (Phase 2+); Technical performance objectives".

[2]                          GSM 03.50: "Digital cellular telecommunications system (Phase 2+); Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system".

[3]                          GSM 06.08: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Performance of the GSM half rate speech codec".

[4]                          GSM 06.10: "Digital cellular telecommunications system (Phase 2+); Full rate speech transcoding".

[5]                          GSM 06.20: "Digital cellular telecommunications system (Phase 2+); Half rate speech transcoding".

3        Abbreviations

For the purposes of the present document, the following abbreviations apply:

A/D                        Analogue to Digital

ADPCM                Adaptive Differential Pulse Code Modulation

ACR                      Absolute Category Rating

BSC                       Base Station Controller

BTS                       Base Transceiver Station

C/I                         Carrier-to-Interferer ratio

CI                           Confidence Interval

CNI                        Comfort Noise Insertion

CRC                      Cyclic Redundancy Check

D/A                        Digital to Analogue

DAT                       Digital Audio Tape

DCR                      Degradation Category Rating

DSP                        Digital Signal Processor

DTMF                   Dual Tone Multi Frequency

DTX                       Discontinuous Transmission for power consumption and interference reduction

EFR                       Enhanced Full Rate

ESP                        Product of E (Efficiency), S (Speed) and P (Percentage of Power) of the DSP

FR                          Full Rate

GBER                    Average gross bit error rate

GSM                      Global System for Mobile communications

HR                         Half Rate

IRS                        Intermediate Reference System (No IRS = rather flat response)

ITU‑T                    International Telecommunication Union – Telecommunication Standardization Sector

MNRU                  Modulated Noise Reference Unit

Mod. IRS              Modified IRS

MOPS                    Million Operations per Second

MOS                      Mean Opinion Score

MS                         Mobile Station

MSC                      Mobile Switching Centre

PCM                      Pulse Code Modulation

PSTN                     Public Switched Telecommunications Network

Q                            Speech-to-speech correlated noise power ratio in dB

SD                          Standard Deviation

SEG                       Speech Expert Group

SID                        Silence Descriptor

SMG                      Special Mobile Group

TCH-EFS              Traffic Channel Enhanced Full rate Speech

TCH-FS                Traffic Channel Full rate Speech

TCH-HS               Traffic Channel Half rate Speech

TDMA                   Time Division Multiple Access

TMOPS                 True Million Operations per Second

UPCM                   Uniform or Linear PCM

VAD                      Voice Activity Detector

WMOPS                Weighted Million Operations per Second

4        Quality under error (EP0 – EP3) and tandeming conditions (Exp Number 1 and Exp Number 5)

Four different Error Patterns (EP0, EP1, EP2 and EP3) were used, where:

EP0                        without channel errors

EP1                        C/I=10 dB;     5% GBER (well inside a cell)

EP2                        C/I= 7 dB;      8% GBER (at a cell boundary)

EP3                        C/I= 4 dB;      13% GBER (outside a cell)

A listening-only test was adopted using the Absolute Category Rating (ACR) method. The results are reported in terms of Equivalent Q (dB) values and Differential Q values (which compare the codec results to the High and Low references). For error and tandeming conditions, results are available from eight laboratories (lab1 to lab8). Tables of results on a lab-by-lab basis are shown in the annex of the present document (table A.1.1 to table A.1.8), negative values indicating worse performance than the reference.
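The sign convention for Differential Q can be illustrated with a short sketch. It assumes that Differential Q is simply the codec's Equivalent Q minus the reference's Equivalent Q, which matches the statement that negative values indicate worse performance than the reference; the function name and all numeric values below are hypothetical and are not results from the annex.

```python
def differential_q(q_codec_db, q_reference_db):
    """Differential Q in dB; negative means the codec scored below the reference."""
    return q_codec_db - q_reference_db


# Hypothetical Equivalent Q values (dB) for one laboratory, for illustration only.
q_efr = {"EP0": 30.0, "EP1": 27.5, "EP2": 22.0}
q_high_ref = {"EP0": 30.5, "EP1": 24.0, "EP2": 25.0}

for condition in q_efr:
    delta = differential_q(q_efr[condition], q_high_ref[condition])
    verdict = "worse than" if delta < 0 else "at least as good as"
    print(f"{condition}: {delta:+.1f} dB ({verdict} the high reference)")
```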

In general, across all laboratories, the EFR codec performs better than the reference TCH-FS for clear speech (EP0), for error conditions EP1 and EP2 and for tandeming under error EP1 conditions. For severe error condition (EP3), the performance is worse than TCH-FS in one laboratory. The EFR is equivalent to the reference G.728 (high reference) for clear speech in all laboratories. Under error conditions, the high reference threshold for severe error condition (EP3) is not met in all laboratories while the threshold for EP1 and EP2 is met for, roughly, half of the laboratories. Under tandeming, the clear condition was tested in only one laboratory where it was compared to another standard G.721; the results indicate that the performance of the EFR (EP0 tandem) is equivalent to that of G.721 (EP0). For tandeming under error condition EP1, equivalence with TCH-FS (EP1) without tandeming is demonstrated in all laboratories except one. Additional results coming from one lab only can be found in table A.1.6 (effect of input levels, other error conditions, tandeming with other standards).

The advantage of the EFR over the current TCH-FS depends on the quality of the network: as channel errors increase, this advantage is reduced. The general trend of the EFR behaviour in error conditions is shown in figure 15.

5        Quality under background noise conditions (Exp Number 2 and Exp Number 3)

Quality under background noise conditions was assessed with a listening-only test, using the Degradation Category Rating (DCR) method. The results are reported for the EFR codec, the Reference G.728 and the TCH-FS codec in terms of DMOS values with Confidence Interval (CI). Six laboratories (lab1 to lab4, lab6 and lab7) performed this experiment, the first four complying with SEG‑4 (see table A.2.1 and table A.2.2).

For each laboratory, the differences in DMOS scores between the EFR codec and the Reference G.728 are of the same order as the confidence intervals for the EFR codec results, with the exception of one point (vehicle noise) in one laboratory. From this, it can be concluded that the performance of the EFR codec, under the background noise conditions tested is equivalent to that of the quality reference G.728 for all laboratories and also to G.721 (tested in one lab only). The degradation introduced by the EFR codec compared to the DIRECT connection in background noise conditions is rated between "unnoticeable" and "noticeable but not annoying". A substantial improvement is achieved over the full rate with music in the background. Additional results from one laboratory can be found in table A.2.2.
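The equivalence argument above compares a DMOS difference with the width of a confidence interval. The sketch below shows one way such a check could be written. It assumes a 95 % interval computed with the normal approximation (1,96 × SD / √n) and treats "of the same order" as "not larger than the EFR half-width"; the present document does not state how the laboratories computed their intervals, and the votes are hypothetical.

```python
import math
import statistics


def dmos_with_ci(votes, z=1.96):
    """Return (DMOS, half-width of the confidence interval) for a list of votes."""
    mean = statistics.mean(votes)
    half_width = z * statistics.stdev(votes) / math.sqrt(len(votes))
    return mean, half_width


def within_ci(votes_codec, votes_reference):
    """True if the DMOS difference does not exceed the codec's CI half-width."""
    dmos_codec, ci_codec = dmos_with_ci(votes_codec)
    dmos_ref, _ = dmos_with_ci(votes_reference)
    return abs(dmos_codec - dmos_ref) <= ci_codec


# Hypothetical DCR votes on the 1 to 5 degradation scale, for illustration only.
efr_votes = [4, 5, 4, 3, 4, 5, 4, 4, 3, 5, 4, 4]
g728_votes = [4, 4, 5, 4, 4, 5, 4, 3, 4, 5, 4, 4]

print(dmos_with_ci(efr_votes), within_ci(efr_votes, g728_votes))
```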

6        Talker dependency (Exp Number 4)

A listening-only test was used with the Degradation Category Rating (DCR) methodology. Results, available from five laboratories (lab1 to lab4 and lab7), are reported in terms of DMOS values with their associated Standard Deviation (SD) to give a measure of the spread of the scores about the averages for each gender for both the EFR codec and the Reference G.728. These experiments clearly show that the standard deviation of the scores of the EFR codec for each gender is smaller than the standard deviation of the reference G.728 in each laboratory. The talker dependency performance for the EFR codec is therefore equivalent to that of G.728. Also, the gender dependency is equivalent to that of the G.728 codec. Tables of results lab-by-lab are shown in the annex (table A.3.1 and table A.3.2).

7        DTX system

7.1        Channel activity in DTX mode

7.1.1       Test procedure

A carefully selected subset of the speech material recorded for testing the half rate DTX system was processed through the codec/DTX C-language simulation. This material comprised 48 real conversations in the English, German and Italian languages. The channel activity of the system was measured for all 48 conversations, and the mean channel activity was then calculated.

7.1.2       Speech channel activity

The percentage of speech frames scheduled for transmission by the radio subsystem (subsequently referred to as the speech channel activity) varied significantly between conversations. Speech channel activities ranged from 29% to 93% for individual sides of a conversation. For this reason, it was not possible to identify any significant trends in the results with regard to terminal type and environmental conditions. The mean speech channel activity, measured over all 48 conversations, was 61 %.

7.1.3       Level compensation

After calculating the mean speech channel activity, it was found that the speech material had been processed at a level 6,5 dB below the original recorded level. Since the activity of the basic VAD algorithm rises by approximately 0,5 per cent per dB increase in input level, the measured activity underestimates the activity at the original level by about 6,5 × 0,5 ≈ 3 %. To compensate for this, 3 % must be added to the speech channel activity estimate.

7.1.4       Interleaving compensation

The channel activity measurements were calculated on a signal frame basis. However, the use of interleaving (depth 4) implies that the TDMA activity will be approximately 2 % higher than the signal frame activity.

7.1.5       Estimated mean TDMA channel activity

The estimated mean TDMA channel activity is shown in table 7.1.5.1.

Table 7.1.5.1: Calculation of mean TDMA channel activity

| Component | Activity |
|---|---|
| speech channel activity | 61 % |
| level compensation | 3 % |
| interleaving compensation | 2 % |
| total TDMA channel activity | 66 % |
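The arithmetic behind table 7.1.5.1 can be reproduced in a few lines. This is a sketch only: the variable and function names are illustrative, and rounding the level compensation to a whole per cent is an assumption made to match the 3 % quoted in subclause 7.1.3.

```python
def level_compensation(level_offset_db, vad_slope_pct_per_db):
    """Activity lost because the material was processed below the recorded level."""
    return level_offset_db * vad_slope_pct_per_db


speech_activity = 61.0                               # measured mean, per cent
level_comp = round(level_compensation(6.5, 0.5))     # 3.25 rounded to 3 (assumed rounding)
interleaving_comp = 2.0                              # depth-4 interleaving, per cent

total_tdma_activity = speech_activity + level_comp + interleaving_comp
print(total_tdma_activity)                           # 66.0
```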

 

7.2        DTX/CNI Informal Expert Listening tests

7.2.1       Introduction

To check the performance of the DTX / CNI system of the ETSI GSM EFR codec, informal expert listening tests were carried out in Italian and German. A very brief check of English speech samples was also made. Special attention was given to clipping effects and noise.

7.2.2       Test environment

Out of the speech samples from the HR codec DTX tests, 8 conversations were selected by CSELT, Deutsche Telekom and British Telecom, respectively. These samples were processed by Nokia and recorded on a DAT, one track without VAD/DTX processing and one track with the DTX / CNI system. By comparing the non-DTX and DTX speech, the listeners could judge the degradation to be not noticeable, minor, moderate or severe. Listeners were allowed to rewind the tape to listen again to critical sections. The listening device was a high-quality headset in mono operation, so that either the track 0 or the track 1 signal was played on both speakers.

7.2.3       Results

In all the speech samples, only two clippings were judged to be noticeable. On comfort noise insertion, conversations with almost no or low background noise were found to have no noticeable degradation. With increasing background noise, the noise related degradation was judged from minor to moderate (the latter in two sections of two conversations). The overall performance of the DTX / CNI system was seen to be fully satisfactory with mostly no or minor degradation.

8        Performance with DTMF tones

8.1        Introduction

A desirable requirement for the GSM Enhanced Full Rate speech codec is a DTMF transparency not worse than the GSM Full Rate codec. For the verification of the ETSI Enhanced Full Rate codec, the DTMF transmission was tested.

8.2        Test environment

A DSP (NEC µPD77016) based PC board was used to measure the transmission of the codec under test. The DTMF software is derived from the Goertzel algorithm, which calculates the spectral power at distinct frequencies by means of a recursive digital filter scheme. The DTMF signal detection is based on "quality factors" calculated from the Distinctive Frequency Test results. Within a wide dynamic range, this technique is independent of the absolute signal level. PTT approvals based on the same hardware and software are available for equipment of European telecom manufacturers.

DTMF signals were tested only under ideal transmission conditions. Unlike in the half rate tests, error patterns were not simulated. In the different experiments, the input signals were varied in tone and pause length, amplitude (also introducing twist, i.e. different amplitudes in the two components of the tone) and frequency. In all experiments, 10 tones were input to the codec. The resulting files were processed by the DTMF detector. As the minimum tone length specified for the input signal of a detector is 80 ms, while the minimum output length of a DTMF generator may be smaller, a test was also done with a 60 ms tone input to the codec.
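The Goertzel recursion mentioned above, together with the tone parameters varied in the experiments (length, row and column amplitude), can be sketched as follows. This is a simplified illustration, not the detector used in the tests: it picks the strongest row and column frequency and does not implement the "quality factors", twist limits or pause checks of a real detector. The 8 kHz sampling rate, the treatment of amplitude 1,0 as the overload point, and all function names are assumptions.

```python
import math

FS = 8000                                   # assumed sampling rate in Hz
ROW_FREQS = (697, 770, 852, 941)            # standard DTMF low group
COL_FREQS = (1209, 1336, 1477, 1633)        # standard DTMF high group
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, freq, fs=FS):
    """Squared magnitude of the signal at `freq`, via the Goertzel recursion."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / fs)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2    # second-order recursive filter
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def dtmf_tone(key, duration_ms, row_db=-18.0, col_db=-18.0, fs=FS):
    """Generate one DTMF tone; amplitudes in dB relative to full scale (1.0)."""
    row = next(r for r, keys in enumerate(KEYPAD) if key in keys)
    col = KEYPAD[row].index(key)
    a_row = 10.0 ** (row_db / 20.0)
    a_col = 10.0 ** (col_db / 20.0)
    n = int(fs * duration_ms / 1000)
    return [a_row * math.sin(2 * math.pi * ROW_FREQS[row] * i / fs)
            + a_col * math.sin(2 * math.pi * COL_FREQS[col] * i / fs)
            for i in range(n)]


def detect_key(samples):
    """Pick the strongest row and column frequency and map them to a key."""
    row = max(range(4), key=lambda r: goertzel_power(samples, ROW_FREQS[r]))
    col = max(range(4), key=lambda c: goertzel_power(samples, COL_FREQS[c]))
    return KEYPAD[row][col]


# An 80 ms tone with 4 dB of twist, fed directly to the detector (no codec).
print(detect_key(dtmf_tone("7", 80, row_db=-18.0, col_db=-22.0)))
```

In the tests reported here the generated tones would pass through the EFR encoder and decoder before reaching the detector; that stage is omitted from the sketch.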

8.3        Results

The test results shown in table 8.3.1 represent the detected tones from the 10 input signals. Table 8.3.2 summarises the test conditions. With input signals fully in the specified range, no detection problems were observed. The shortest allowed input signal to a transmission line (80 ms) was detected 100 % in all experiments with different input levels, twist and frequency deviations. An effect known from the HR codec tests, in which long tones were detected as two tones, was not observed. Only for tones shorter than 80 ms did the detection rate fall, to 96 %, without a sharp decrease and without any particular tone showing problems.

In conclusion, the codec was found to be 100 % transparent to DTMF signals under nominal conditions. Only tones shorter than the minimum input specification of 80 ms are not fully detected. The results are better than those of the FR codec. The requirement is fulfilled.

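Table 8.3.1: Results of DTMF experiments

| tone | N18 | N22 | N18‑22 | N22‑26 | D18 | D18‑22 | L 120 | L 200 | S 60 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 2 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9 |
| 3 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 4 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9 |
| 5 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 6 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 7 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 8 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 9 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 0 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| * | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| # | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9 |
| A | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| B | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9 |
| C | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9 |
| D | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| total_d | 160 | 160 | 160 | 160 | 160 | 160 | 160 | 160 | 155 |
| det_rate | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 96 |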

 

In rows 1 – D the number of detected tones from 10 inputs is shown
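The totals in the S 60 column can be checked directly from the per-tone counts. The sketch below encodes only the five tones for which table 8.3.1 shows 9 detections out of 10; the names are illustrative. The exact ratio is 155/160 = 96,875 %, which the table lists as 96, apparently truncated rather than rounded.

```python
# Tones missed (out of 10 inputs) per key in experiment S 60, from table 8.3.1.
MISSED_S60 = {"2": 1, "4": 1, "#": 1, "B": 1, "C": 1}
KEYS = "1234567890*#ABCD"
INPUTS_PER_KEY = 10

detected = {k: INPUTS_PER_KEY - MISSED_S60.get(k, 0) for k in KEYS}
total_detected = sum(detected.values())          # total_d row
total_sent = INPUTS_PER_KEY * len(KEYS)
detection_rate = 100.0 * total_detected / total_sent

print(total_detected, total_sent, detection_rate)
```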

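Table 8.3.2: Conditions of above listed experiments

| experiment | N18 | N22 | N18‑22 | N22‑26 | D18 | D18‑22 | L 120 | L 200 | S 60 | unit |
|---|---|---|---|---|---|---|---|---|---|---|
| tone | 80 | 80 | 80 | 80 | 80 | 80 | 120 | 200 | 60 | ms |
| pause | 80 | 80 | 80 | 80 | 80 | 80 | 120 | 80 | 60 | ms |
| r_amp | ‑18 | ‑22 | ‑18 | ‑22 | ‑22 | ‑18 | ‑22 | ‑28 | ‑22 | dB |
| c_amp | ‑18 | ‑22 | ‑22 | ‑26 | ‑22 | ‑22 | ‑22 | ‑28 | ‑22 | dB |
| delta_f | 0 | 0 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | % |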

 

r_amp and c_amp are the row amplitude and column amplitude respectively, dB values are relative to the overload point.

9        Network information tones

The signals shown in table 9 were first compressed by the encoder, then decompressed by the decoder, and then listened to via quality headphones using a high-quality PC audio card. The codec showed no perceivable degradation to the transmission of these PSTN network information tones both with and without the VAD/DTX system switched on. No clipping or other disturbing artefacts were noticed when DTX was enabled. Checking tones in use around the world as listed in ITU Recommendation E.180 Supplement 2 (Jan 94) indicated that this test achieves almost 100 % global coverage by simply testing UK, German, and USA tones.

Table 9: PSTN Information Tones Tested

| German (no DTX) | German (with DTX) |
|---|---|
| 3 dial tones | 2 dial |
| 1 ringing tone | 1 ringing |
| 2 busy tones (subscriber engaged) | 2 busy |
| 1 special information tone (number unobtainable) | 1 special information tone |
| 2 congestion tones (network equipment engaged) | 1 fax modem call setup tone sequence |
| United Kingdom (no DTX) | United Kingdom (with DTX) |
| 3 dial tones | 1 dial |
| 1 ringing tone | 1 ring |
| 1 busy tone (subscriber engaged) | 1 busy |
| 1 congestion tone (network equipment engaged) | 1 congest |
| - | 1 sustained, low-level sinusoid (number unobtainable) |
| USA (no DTX) | USA (with DTX … not tested) |
| 1 dial tone | - |
| 1 ringing tone | - |
| 1 busy tone (subscriber engaged) | - |
| 1 special information tone | - |
| 1 congestion tone (network equipment engaged) | - |

 

Tones were computer generated for the tests in which DTX was switched off. Authentic DAT recordings of PSTN information tones were used to check the performance with DTX switched on, except the low-level sinusoid signal for "UK number unobtainable" which was computer generated.

Two kinds of special input signals have been chosen to be tested in the verification phase of the Enhanced Full Rate: music signals and noise signals.

10.1      Music signals

This subclause reports on the informal listening tests conducted at CSELT to evaluate the performance of the EFR codec with music signals.

The tests were informal pair comparisons (A versus B without repetition) of the Full Rate codec, the Enhanced Full Rate codec and the ITU‑T G.726 ADPCM codec at 32 kbit/s. The tests used 6 music items taken from those selected by ISO-MPEG to test audio codec standards. Each music item lasts between 8 and 10 seconds. The music items were downsampled to 8 kHz before processing. Listening was performed by 12 naive listeners through headphones.

The results are reported in tables 10.1.1 and 10.1.2.

Table 10.1.1: Results of the informal test on performance
with music signals: Enhanced Full Rate versus Full Rate

| Music Items | Enhanced Full Rate preferred to Full Rate | Enhanced Full Rate equal to Full Rate | Full Rate preferred to Enhanced Full Rate |
|---|---|---|---|
| Harpsichord | 100 % | 0 % | 0 % |
| Carmen | 25 % | 41,7 % | 33,3 % |
| Trumpet | 100 % | 0 % | 0 % |
| Castanets | 33,3 % | 41,7 % | 25 % |
| Mediterraneo | 41,7 % | 33,3 % | 25 % |
| Vivaldi "The spring" | 100 % | 0 % | 0 % |
| Total | 66,7 % | 19,4 % | 13,9 % |

 

Table 10.1.2: Results of the informal test on performance
with music signals: Enhanced Full Rate versus ADPCM 32 kbit/s

| Music Items | Enhanced Full Rate preferred to ADPCM | Enhanced Full Rate equal to ADPCM | ADPCM preferred to Enhanced Full Rate |
|---|---|---|---|
| Harpsichord | 50 % | 8,3 % | 41,7 % |
| Carmen | 0 % | 25 % | 75 % |
| Trumpet | 33,3 % | 33,3 % | 33,3 % |
| Castanets | 8,3 % | 41,7 % | 50 % |
| Mediterraneo | 16,7 % | 25 % | 58,3 % |
| Vivaldi "The spring" | 16,7 % | 25 % | 58,3 % |
| Total | 20,9 % | 26,4 % | 52,7 % |

 

The analysis of results shows a certain dependency of performance on the music item. There is at least one item in which the FR has been judged better than the EFR. Nevertheless, on the average, the EFR provides better performance than the FR, whilst it appears to perform worse than the ADPCM.
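The percentages in table 10.1.1 are consistent with each of the 12 listeners casting one vote per music item (for example 41,7 % is 5 of 12). The sketch below converts the percentages back to vote counts under that assumption and recomputes the Total row; the one-vote-per-listener reading and the function name are illustrative, not stated in the report.

```python
LISTENERS = 12  # number of naive listeners stated in the text

# Percentages from table 10.1.1: (EFR preferred, equal, FR preferred).
efr_vs_fr = {
    "Harpsichord": (100.0, 0.0, 0.0),
    "Carmen": (25.0, 41.7, 33.3),
    "Trumpet": (100.0, 0.0, 0.0),
    "Castanets": (33.3, 41.7, 25.0),
    "Mediterraneo": (41.7, 33.3, 25.0),
    "Vivaldi": (100.0, 0.0, 0.0),
}


def to_votes(percent):
    """Nearest whole number of listeners for a published percentage."""
    return round(percent * LISTENERS / 100.0)


votes = {item: tuple(to_votes(p) for p in row) for item, row in efr_vs_fr.items()}
totals = [sum(v[i] for v in votes.values()) for i in range(3)]
total_percent = [round(100.0 * t / sum(totals), 1) for t in totals]
print(totals, total_percent)
```

With these inputs the pooled votes are 48, 14 and 10 out of 72, which reproduces the 66,7 %, 19,4 % and 13,9 % of the Total row.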

10.2      Noise signals

To check the scaling performance of the fixed-point algorithm, noise signals with levels ranging from ‑10 dB down to ‑70 dB were processed by the encoder and decoder in error-free conditions. The level of the decoder output signal was examined. For all signals the reconstructed output level followed the input level. Even for very low signal levels no problems were detected.

This clause deals with the results of an informal listening test to evaluate the performance of the EFR for some languages which were not tested formally.

The tests were informal pair comparisons (A versus B without repetition) of the Full Rate codec, the Enhanced Full Rate codec and the ITU‑T G.726 ADPCM codec at 32 kbit/s. The tests involved 5 different languages (Arab, Chinese, Japanese, Polish and Portuguese). Recording and listening were performed by naive native speakers. For most languages, however, only one listener was available, so the results should be treated with due caution.

The test was performed by gathering native speakers of the different languages at the CSELT premises. Subjects were asked to record a list of sentences in their own language. The sentence length ranged from 4 to 6 seconds. The languages and the number of listeners and sentences are reported in table 11.1.

Table 11.1: List of languages and the number of listeners and sentences used

| LANGUAGE | NUMBER OF LISTENERS | NUMBER OF SENTENCES |
|---|---|---|
| Arab | 2 | 8 |
| Chinese | 1 | 8 |
| Japanese | 1 | 8 |
| Polish | 2 | 8 |
| Portuguese | 1 | 8 |

 

The subjects were asked to listen to an A-B sequence and could either express a preference or judge the perceived quality to be the same. The results of the test are reported in tables 11.2 and 11.3.

Table 11.2: Results of the informal test on languages not covered
in the formal tests: Enhanced Full Rate versus Full Rate

| Languages | Enhanced Full Rate preferred to Full Rate | Enhanced Full Rate equal to Full Rate | Full Rate preferred to Enhanced Full Rate |
|---|---|---|---|
| Arab | 37,5 % | 50 % | 12,5 % |
| Chinese | 100 % | 0 % | 0 % |
| Japanese | 100 % | 0 % | 0 % |
| Polish | 68,7 % | 12,5 % | 18,8 % |
| Portuguese | 75 % | 25 % | 0 % |

 

Table 11.3: Results of the informal test on languages not covered
in the formal tests: Enhanced Full Rate versus ADPCM at 32 kbit/s

| Languages | Enhanced Full Rate preferred to ADPCM | Enhanced Full Rate equal to ADPCM | ADPCM preferred to Enhanced Full Rate |
|---|---|---|---|
| Arab | 18,75 % | 75 % | 6,25 % |
| Chinese | 87,5 % | 12,5 % | 0 % |
| Japanese | 87,5 % | 12,5 % | 0 % |
| Polish | 25 % | 37,5 % | 37,5 % |
| Portuguese | 12,5 % | 50 % | 37,5 % |

 

The analysis of the results confirms the good performance of the Enhanced full-rate also for languages not considered in the formal experiments.

This seems to be the case for all the languages tested, even though the test size was very small. The EFR was always preferred in comparison to the Full-rate. For Chinese and Japanese the preference is stronger and, for these languages, the EFR is preferred also to the ADPCM at 32 kbit/s in most of the cases.

The round-trip delay of a communication using a TCH-EFS has been estimated taking into account all the system and processing delays.

The symbol definitions for the calculations in this section are:

Tabisd                   The time required to transmit the 260 speech frame data bits (bits D1 – D260, C16 and the 17 synchronization bits -> 278 bits) over the 16 kbit/s A‑bis‑interface in the downlink direction (system dependent).

Tabisu                   The time required to transmit the first 137 TRAU frame bits, the first 34 of which can be sent by anticipation, leading to a delay of 103 TRAU frame bits (D2 – D98 speech frame data bits including the CRCs + 6 synchronization bits) over the 16 kbit/s A‑bis‑interface in the uplink direction (system dependent).

Tad                        Delay in the analogue to digital converter in the uplink.

Tbsc                       Switching delay in the BSC (implementation dependent).

Tbuff                     Due to the time alignment procedure for inband control of the remote transcoder in case of a 16 kbit/s A‑bis‑interface in the downlink direction, it is required to have a buffer in the BTS of 1 ms + one 250 µs regulation step (system dependent).

Tda                        Delay in the digital to analogue converter in the downlink.

Techo                    Delay due to the echo canceller.

Tencode:               The time required for the channel encoder to perform channel encoding (implementation dependent).

Tmsc                      Switching delay in the MSC.

Tpcm                     The duration of a segment of PCM speech for the downlink processing delay.

Tproc:                    The time required after reception of the last encoded speech parameter of the first subframe (FCB‑Gain1) to process the speech encoded data for the enhanced full rate speech decoder and to produce the first PCM output sample (implementation dependent).

Trftx:                     The time required for transmission of a TCH radio interface frame over the air interface due to the interleaving and de-interleaving (system dependent).

Trxproc:                The time required after reception over the radio interface to perform equalization, channel decoding and SID-frame detection (implementation dependent).

Tsample:               The duration of the segment of PCM speech operated on by the speech transcoder.

Tsps                       Delay of the speech encoder in the BSC after reception of the last PCM sample until availability of the first encoded bit (implementation dependent).

Ttransc:                 The MS speech encoder processing time, from input of the last PCM sample to output of the final encoded bit (implementation dependent).

The processing delays were estimated from the detailed complexity figures previously computed in the verification phase. The complexity estimation is based on rules that are intended to be relevant from an implementation point of view while remaining independent of any specific DSP. The same approach was therefore followed for the processing delays. The DSP that runs the codec is modelled by three parameters: E, S and P.

E stands for the Efficiency of the DSP. This corresponds to the ratio TMOPS/WMOPS of the implementation of the codec on the DSP.

S stands for the Speed of the DSP: Maximum Number of Operations that the DSP can run in 1 second. This number is expressed in MOPS.

P stands for the percentage of DSP processing power assigned to the codec.

The processing delay of a task whose complexity is X can then be computed using the formula:

                               D = X*20/ESP,

the time unit being ms.

The following assumptions were made when computing the round-trip delay:

-     for the Enhanced Full Rate MS delay, it is assumed that the DSP has the same performance as the DSP used for GSM HR [5];

-     for the Enhanced Full Rate BSC delays, it is assumed that the DSP of the TRAU will have the same performance as the DSP used for GSM HR;

-     for the Enhanced Full Rate BTS delay, it is assumed that the DSP will have the same performance as the DSP used for GSM FR [4]. The reason is that it is assumed that the GSM Full Rate BTS will be reused during first GSM EFR deployments;

-     a 16 kbit/s submultiplexed A-bis is used between the BTS and the BSC-TRAU.

From these assumptions and following the complexity of GSM HR [3] and its delay requirement for the MS [2], the ESP value has been computed for EFR:

ESP = 25
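As a worked illustration of the formula D = X*20/ESP with ESP = 25, the sketch below evaluates it for three complexity figures quoted in tables 12.1 and 12.2. The function name is hypothetical, and reading the factor 20 as the 20 ms speech frame length is an assumption.

```python
FRAME_MS = 20.0  # the factor 20 in the formula, taken here as the 20 ms speech frame
ESP_EFR = 25.0   # product E*S*P computed for EFR in the text


def processing_delay_ms(complexity_wmops, esp=ESP_EFR):
    """Processing delay D = X * 20 / ESP, in ms, for a task of complexity X (WMOPS)."""
    return complexity_wmops * FRAME_MS / esp


# Complexity figures quoted in the "Data" column of tables 12.1 and 12.2.
for name, wmops in (("Tproc", 1.59), ("Ttransc", 15.21), ("channel decoding", 2.45)):
    print(name, round(processing_delay_ms(wmops), 2))
```

Under these assumptions the formula gives about 1,27 ms for Tproc and 12,17 ms for Ttransc, matching the tables, and 1,96 ms for channel decoding, which together with the 6,84 ms left for the equaliser gives the 8,8 ms listed for Trxproc.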

The following delays, given in [1] and [2] for the GSM Full Rate and common to the GSM Enhanced Full Rate, are considered realistic and therefore retain the same values:

-     MSC:    Tmsc, margin;

-     BSC:     Tbsc, margin;

-     BTS:     Trxproc, margin;

-     MS:       Trftx, Tda.

The results of the estimation are provided in table 12.1 for uplink and table 12.2 for downlink. The time unit for all delays is ms (10⁻³ s).

Table 12.1: Uplink delay

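| Equipment | Speed Parameter | Delay (ms) | Data |
|---|---|---|---|
| MSC | Tmsc | 0,5 | |
| | margin | 0,5 | |
| BSC | Tbsc | 0,5 | |
| | Tproc | 1,27 | 1,59 WMOPS |
| | margin | 0,5 | |
| BTS | Tabisu | 6,4375 | 103 bits |
| | Trxproc | 8,8 | 2,45 WMOPS (note) |
| | margin | 3 | |
| MS | Trftx | 37,5 | |
| | Tencode | 0,32 | 0,20 WMOPS |
| | Ttransc | 12,17 | 15,21 WMOPS |
| | Tsample | 20 | |
| | Tmargin | 2 | |
| | Tad | 1 | |
| SUM | Uplink | 94,4975 | |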

 

NOTE:       This theoretical complexity corresponds to the channel decoding only. This leaves 6,84 ms for the equaliser in Trxproc.

 

Table 12.2: Downlink delay

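| Equipment | Speed Parameter | Delay (ms) | Data |
|---|---|---|---|
| MSC | Techo | 1 | |
| | Tmsc | 0,5 | |
| | margin | 0,5 | |
| BSC | Tbsc | 0,5 | |
| | Tsample | 20 | |
| | Tsps | 2,3 | |
| | Tabisd | 17,375 | 278 bits |
| | margin | 0,5 | |
| BTS | Tbuff | 1,25 | |
| | Tencode | 1,60 | 0,20 WMOPS |
| | margin | 0,45 | |
| MS | Trftx | 37,5 | |
| | Trxproc | 8,8 | 2,45 WMOPS (note) |
| | Tproc | 1,27 | 1,59 WMOPS |
| | margin | 2 | |
| | Tda | 1 | |
| SUM | Downlink | 96,547 | |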

 

NOTE:       This theoretical complexity corresponds to the channel decoding only. This leaves 6,84 ms for the equaliser in Trxproc.

 

Round-trip delay = Uplink delay + Downlink delay = 191,04 ms

This delay is very close to the delay indicated in [1], [2] and [3] for GSM Full Rate: 188,5 ms. The difference should be unnoticeable.
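The A‑bis transmission times and the uplink total can be reproduced from the table entries with a short calculation. The sketch below is illustrative only: the dictionary keys and function name are hypothetical, and the downlink total is taken as listed in table 12.2 rather than recomputed.

```python
import math

ABIS_RATE_BITS_PER_MS = 16  # 16 kbit/s A-bis sub-channel, i.e. 16 bits per ms


def abis_delay_ms(bits):
    """Time in ms to send the given number of bits over the 16 kbit/s A-bis link."""
    return bits / ABIS_RATE_BITS_PER_MS


# Uplink entries of table 12.1, in ms (decimal commas written as points).
uplink_ms = {
    "Tmsc": 0.5, "MSC margin": 0.5,
    "Tbsc": 0.5, "Tproc": 1.27, "BSC margin": 0.5,
    "Tabisu": abis_delay_ms(103), "Trxproc": 8.8, "BTS margin": 3.0,
    "Trftx": 37.5, "Tencode": 0.32, "Ttransc": 12.17,
    "Tsample": 20.0, "Tmargin": 2.0, "Tad": 1.0,
}

uplink_total = math.fsum(uplink_ms.values())
DOWNLINK_TOTAL_MS = 96.547  # as listed in table 12.2
round_trip = uplink_total + DOWNLINK_TOTAL_MS
print(f"{uplink_total:.4f} {round_trip:.2f}")
```

The 103 uplink bits and 278 downlink bits correspond to 6,4375 ms and 17,375 ms at 16 kbit/s, as listed for Tabisu and Tabisd.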

13.1      Introduction

A characteristic test in the verification of GSM speech codecs is the frequency response test. Sine tones in the telephony frequency band are input to the codecs, and after decoding the gain is calculated. It has to be pointed out that the frequency response measurement is given just as a piece of additional information and does not add information on the actual behaviour of the codec in terms of perceived quality or DTMF transparency.

13.2      Test environment

The tones were generated at a nominal level of 22 dB below the overload point. Tones ranging from 80 Hz to 3 600 Hz in steps of 21 Hz, each with a nominal length of 2 s, were input to the codec under test. After decoding, the gain was calculated over 400 ms intervals, and these results were averaged over the total duration of each tone to obtain the frequency response curve. This made it possible to check the transition behaviour of the codec and, if necessary, to disregard the first samples.
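The measurement procedure can be sketched as follows. This is not the test software used for the report: the 8 kHz sampling rate, the mapping of the overload point to a peak amplitude of 1.0, the RMS-based gain definition, and the `placeholder_codec` function (a fixed attenuation standing in for the EFR encoder and decoder) are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed telephony sampling rate


def sine_tone(freq_hz, level_db, duration_s=2.0):
    """Sine tone at level_db relative to an assumed overload amplitude of 1.0."""
    amp = 10.0 ** (level_db / 20.0)
    n = int(SAMPLE_RATE * duration_s)
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_db(inp, out, interval_s=0.4):
    """Return (mean gain, per-interval gains) in dB over consecutive intervals."""
    step = int(SAMPLE_RATE * interval_s)
    gains = [
        20.0 * math.log10(rms(out[i:i + step]) / rms(inp[i:i + step]))
        for i in range(0, len(inp) - step + 1, step)
    ]
    return sum(gains) / len(gains), gains


def placeholder_codec(samples):
    """Stand-in for encoder + decoder: a fixed 0.5 attenuation, NOT the EFR codec."""
    return [0.5 * s for s in samples]


for freq in (80, 1004, 3587):
    tone = sine_tone(freq, level_db=-22.0)
    mean_gain, interval_gains = gain_db(tone, placeholder_codec(tone))
    print(freq, round(mean_gain, 2), len(interval_gains))
```

Keeping the per-interval gains alongside the mean is what allows the first interval to be inspected separately or dropped when checking transition behaviour.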

13.3      Results

Within the telephony band the frequency response is very flat. No abnormal deviations were observed. Additional experiments with different input levels (‑18 dB, ‑28 dB) or different tone lengths (500 ms, 4 s) resulted in almost identical curves. The decrease in gain above 3 kHz is relatively small and remains far from a 3 dB loss. The transition behaviour was very good.

Figure 13: GSM EFR codec frequency response at different input levels

The complexity of the Enhanced Full Rate is characterised by the 3 following items:

-     the number of cycles;

-     the data memory size;

-     the program memory size.

The values of these different figures depend on a specific DSP implementation. Nevertheless, the results obtained by the C description analysis can be used as references.

The speech transcoding functions are specified using a set of basic arithmetic operations. The WMOPS figure quoted is a weighted sum of the operations required to perform transcoding. The weight assigned to each operation is representative of the number of instruction cycles required to perform that operation on a typical DSP device.

The complexity range of the Enhanced Full Rate is equivalent to the Half Rate codec complexity.

The number of cycles required by the Enhanced Full Rate algorithm is relatively independent of the values of the input samples. The execution times of an average and an extreme input case are equivalent.

Nevertheless the following table presents the theoretical worst case evaluation, i.e. the maximum possible number of cycles, which is consistent with the results indicated in [3].

The following figures are associated to the Speech and Channel part excluding the DTX functions.

Table 14: Principal complexity figure

 

| Codec | Theoretical worst case WMOPS | Data RAM (note) (16-bit words) | Data ROM (constants) (16-bit words) | Program ROM (assembly instructions) |
|---|---|---|---|---|
| Enhanced Full Rate | 18,1 | 4 708 | 5 363 | 6 000 – 9 000 |
| Half Rate | 21,2 | 5 002 | 8 781 | 8 000 – 12 000 |

 

NOTE:      The Data RAM figure can be split in 2 parts: the static variables: 2 240 words; and the dynamic variables (i.e. local to a procedure ): 2 468 words.

The EFR codec is better than the current FR codec for clear speech, for all error conditions (EP1, EP2 and EP3) and for tandeming under error condition EP1; it is equivalent to G.728 for its intrinsic quality, for background noise conditions and for talker dependency. The EFR codec does not reach the objective performance target (TCH-FS EP2) for the severe error condition EP3; for error conditions EP1 and EP2 it does not reach the objective performance target for half of the results. The EFR quality under tandeming without errors was not tested against the target G.728 but was found equivalent to G.721. The advantage of the EFR over the TCH-FS depends on the quality of the network: as channel errors increase, this advantage is reduced.

Table 15: Summary of Results

| Conditions | High Ref | Low Ref |
|---|---|---|
| EP0 | Equivalent to G.728 | Equivalent to G.728; Better than TCH-FS |
| EP1 | Worse than MNRU 24 dB for half of labs | Better than TCH-FS (EP1) |
| EP2 | Worse than TCH-FS (EP1) for half of labs | Better than TCH-FS (EP2) |
| EP3 | Worse than TCH-FS (EP2) | Better than TCH-FS (EP3) except for one lab. |
| EP0 (tandem) | G.728 (not tested); Equivalent to G.721 | G.728 (not tested); Equivalent to G.721 |
| EP1 (tandem) | Equivalent to TCH-FS (EP1) | Better than TCH-FS (EP1 tandem) |
| Vehicle 10 | Equivalent to G.728 | Equivalent to G.728 |
| Music 20 | Equivalent to G.728 | Equivalent to G.728; Better than TCH-FS |
| Male Talkers | Equivalent to G.728 | Equivalent to G.728 |
| Female Talkers | Equivalent to G.728 | Equivalent to G.728 |
| Children | Equivalent to G.728 | Equivalent to G.728 |

 

Figure 15: General trend of the EFR behaviour for error conditions in noise-free environment


Table A.1.1: Q values and Differential Q (dB) values from References for error and tandeming conditions (BT/lab1, Mod. IRS input characteristics – SEG‑4, Exp#1 and Exp#5)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 | +3,71 | +3,71 | 29,86 | 26,15 | 26,15 |
| EP1 | -2,42 | +2,96 | 21,58 | 24 | 18,62 |
| EP2 | -2,97 | +0,96 | 15,65 | 18,62 | 14,69 |
| EP3 | -11,30 | -0,55 | 0,41 | 11,71 | 0,96 |
| EP0 (tandem) | - | - | - | 22,94 | 22,94 |
| EP1 (tandem) | -2,72 | +1,26 | 15,90 | 18,62 | 14,64 |
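The differential Q values in tables A.1.1 to A.1.8 appear to be the EFR Q value minus the Q value of the respective reference. The sketch below checks this reading against the rows of table A.1.1 that have all three Q values; the relation is inferred from the numbers rather than stated in the report, and the function name is hypothetical.

```python
# (Q EFR, Q High Ref, Q Low Ref) from table A.1.1, in dB.
q_values = {
    "EP0": (29.86, 26.15, 26.15),
    "EP1": (21.58, 24.0, 18.62),
    "EP2": (15.65, 18.62, 14.69),
    "EP3": (0.41, 11.71, 0.96),
    "EP1 (tandem)": (15.90, 18.62, 14.64),
}


def differential_q(q_efr, q_ref):
    """Differential Q in dB: positive when the EFR scores above the reference."""
    return round(q_efr - q_ref, 2)


diffs = {
    cond: (differential_q(efr, high), differential_q(efr, low))
    for cond, (efr, high, low) in q_values.items()
}
for cond, (d_high, d_low) in diffs.items():
    print(f"{cond}: {d_high:+.2f} (High Ref), {d_low:+.2f} (Low Ref)")
```

For these rows the computed differences agree with the tabulated differential Q values, for example -2,72 and +1,26 for EP1 (tandem).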

 

Table A.1.2: Q values and Differential Q (dB) values from References for error and tandeming conditions (CNET/lab2, Mod. IRS input characteristics – SEG‑4, Exp#1 and Exp#5)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 | +12,59 | +12,59 | 39,06 | 26,47 | 26,47 |
| EP1 | 0 / -1,33 | +6,14 | 22,67 | 22,67 / 24 | 16,53 |
| EP2 | +0,15 | +2,32 | 16,68 | 16,53 | 14,36 |
| EP3 | -11,95 | +1,21 | 2,41 | 14,36 | 1,20 |
| EP0 (tandem) | - | - | - | 25,71 | 25,71 |
| EP1 (tandem) | +2,22 | +5,29 | 18,75 | 16,53 | 13,46 |

 

Table A.1.3: Q values and Differential Q (dB) values from References for error and tandeming conditions (TD/lab3, Mod. IRS input characteristics – SEG‑4, Exp#1 and Exp#5)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 | +1,98 | +1,98 | 28,66 | 26,68 | 26,68 |
| EP1 | +2,74 / +2,60 | +7,06 | 26,60 | 23,86 / 24 | 19,54 |
| EP2 | -1,53 | +2,50 | 18,01 | 19,54 | 15,51 |
| EP3 | -15,33 | > +0,18 | 0,18 | 15,51 | < 0 |
| EP0 (tandem) | - | - | - | 23,66 | 23,66 |
| EP1 (tandem) | +0,76 | +6,06 | 20,30 | 19,54 | 14,24 |

 

Table A.1.4: Q values and Differential Q (dB) values from References for error and tandeming conditions (NEC/lab4, Mod. IRS input characteristics – SEG‑4, Exp#1 and Exp#5)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 | +3,70 | +3,70 | 26,32 | 22,62 | 22,62 |
| EP1 | -1,50 | +5,50 | 22,50 | 24 | 17,00 |
| EP2 | +4,63 | +6,76 | 21,63 | 17,00 | 14,87 |
| EP3 | -10,49 | +2,70 | 4,38 | 14,87 | 1,68 |
| EP0 (tandem) | - | - | - | 19,32 | 19,32 |
| EP1 (tandem) | +2,92 | +8,49 | 19,92 | 17,00 | 11,43 |

 

Table A.1.5: Q values and Differential Q (dB) values from References for error and tandeming conditions (MOTOROLA/lab5, Mod. IRS input characteristics)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 | - | - | 24,82 | ? | - |
| EP1 | -4,41 | +3,79 | 19,59 | 24 | 15,80 |
| EP2 | -1,17 | +3,35 | 14,63 | 15,80 | 11,28 |
| EP3 | -7,23 | > +4,05 | 4,05 | 11,28 | < 0 |
| EP0 (tandem) | - | - | - | - | - |
| EP1 (tandem) | - | - | - | 15,80 | - |

 

Table A.1.6: Q values and Differential Q (dB) values from References
for error and tandeming conditions (COMSAT/lab6)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 – (flat input) | +1,39 | +1,39 | 31,03 | 29,64 | 29,64 |
| EP1 (Mod. IRS) | ~ +2,79 | > +5,86 | > 25 | -(24) | 19,14 |
| EP2 (Mod. IRS) | +1,03 | +4,15 | 20,17 | 19,14 | 14,99 |
| EP3 | - | - | | 14,99 | - |
| EP0 (tandem) – (flat input) | (G.728); +2,35 (G.721) | (G.728); +2,35 (G.721) | 28,78 | (G.728); 26,43 (G.721) | (G.728); 26,43 (G.721) |
| EP1 (tandem) – (flat input) | - | - | - | 19,14 | - |
| Extra Conditions (not included in SEG‑4, High and Low references not formally defined) | | | | G.721 (same condition) | TCH-FS (same condition) |
| EP0 –16 dBmOL – (flat input) | +2,31 (G.721) | +7,80 | 34,40 | 32,09 (G.721) | 27,32 |
| EP0 –36 dBmOL – (flat input) | -0,61 (G.721) | +2,41 | 25,08 | 25,69 (G.721) | 22,67 |
| C/I 10 dB, 1.5 mph (Mod. IRS) | | > +5,99 | > 25 | | 19,01 |
| C/I 13 dB (Mod. IRS) | | > +4,04 | > 25 | | 20,96 |
| C/I 13 dB tandem (Mod. IRS) | | > +9,80 | > 25 | | 15,20 |
| EP1 tandem EFR/TCH-FS – (flat) | | - | 24,46 | | - |
| EP1 tandem EFR/G.721 – (flat) | | +2,93 | 27,36 | | 24,43 |

 

Differences compared to the SEG‑4:  Different input characteristics (flat, except for error conditions), Additional input levels, tandemings and standards, G.721 as extra High Reference, Different MNRU selection, Separate experiment for error conditions (Non static, no frequency hopping 10 and 7 dB C/I, 30 mph, typical urban multipath, Mod. IRS input characteristics, MNRUmax = 25), No EP3 experiment.

Table A.1.7: Q values and Differential Q (dB) values from References
for error and tandeming conditions (NOKIA/lab7)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 | > +2,12 | > +2,12 | > 30 | 27,88 | 27,88 |
| EP1 | ~ -3 | +14,79 | 27,88 | - (MNRU25 31,97) | 13,09 |
| EP2 | +4,90 | +8,65 | 17,99 | 13,09 | 9,34 |
| EP3 | -7,49 | > +1,85 | 1,85 | 9,34 | < 0 |
| EP0 (tandem) | - | - | - | 21,85 | 21,85 |
| EP1 (tandem) | +5,63 | +7,99 | 18,72 | 13,09? | 10,73 |
| Extra conditions (not included in SEG‑4) | | | | | |
| C/I 13 dB | - | > 14,91 | > 30 | - | 15,09 |

 

Table A.1.8: Q values and Differential Q (dB) values from References for error and tandeming conditions (TI/lab8, Mod. IRS input characteristics –SEG‑4, Exp#1 and Exp#5)

| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
|---|---|---|---|---|---|
| EP0 | +2,36 | +2,36 | 20,41 | 18,05 | 18,05 |
| EP1 | -5,21 | +5,15 | 18,79 | 24 | 13,64 |
| EP2 | -0,48 | +2,60 | 13,16 | 13,64 | 10,56 |
| EP3 | - | - | - | 10,56 | - |
| EP0 (tandem) | - | - | - | 17,18 | 17,18 |
| EP1 (tandem) | +1,03 | +5,16 | 14,67 | 13,64 | 9,51 |

 

Table A.2.1: DMOS (and CI) values for EFR codec, G.728 Reference and TCH-FS
(for lab1 to lab4, flat input characteristics – SEG‑4, Exp#2
and Exp#3)

| Conditions | Lab1 BT | Lab2 CNET | Lab3 TD | Lab4 NEC |
|---|---|---|---|---|
| EFR Vehicle 10 | 4,36 (0,17) | 4,49 (0,12) | 4,26 (0,16) | 4,44 (0,18) |
| EFR Music 20 | 4,29 (0,15) | 4,55 (0,11) | 4,20 (0,14) | 4,48 (0,18) |
| G.728 Vehicle 10 | 4,54 (0,15) | 4,47 (0,14) | 4,59 (0,13) | 4,48 (0,14) |
| G.728 Music 20 | 4,46 (0,13) | 4,52 (0,17) | 4,24 (0,11) | 4,52 (0,16) |
| TCH-FS Vehicle 10 | 4,20 (0,17) | 4,50 (0,11) | 4,16 (0,16) | 4,06 (0,19) |
| TCH-FS Music 20 | 3,36 (0,15) | 3,47 (0,15) | 3,11 (0,15) | 3,31 (0,20) |

 

Table A.2.2: DMOS (and CI) values for EFR codec, G.728 Reference and extra Standards
(for lab5 to lab8, flat input characteristics)

| Conditions | Lab6/Comsat (1) (2) | Lab7/Nokia (1) |
|---|---|---|
| EFR Vehicle 10 | - | 4,47 (0,12) |
| EFR Music 20 | - | 4,57 (0,10) |
| G.728 Vehicle 10 | - | 4,45 (0,12) |
| G.728 Music 20 | - | 4,46 (0,11) |
| TCH-FS Vehicle 10 | - | 3,75 (0,15) |
| TCH-FS Music 20 | - | 3,54 (0,17) |
| Extra Conditions (not included in SEG‑4) | | |
| EFR Home 20 dB | 4,79 (0,08) | - |
| EFR Vehicle 15 dB | 4,61 (0,10) | - |
| EFR Vehicle 25 dB | 4,65 (0,09) | - |
| EFR Street 10 dB | 4,41 (0,13) | - |
| EFR Office 20 dB | 4,66 (0,10) | - |
| TCH-FS Home 20 dB | 4,35 (0,12) | - |
| TCH-FS Vehicle 15 dB | 4,06 (0,13) | - |
| TCH-FS Vehicle 25 dB | 4,15 (0,14) | - |
| TCH-FS Street 10 dB | 3,54 (0,18) | - |
| TCH-FS Office 20 dB | 3,86 (0,15) | - |
| G.721 Home 20 dB | 4,67 (0,11) | - |
| G.721 Vehicle 15 dB | 4,56 (0,11) | - |
| G.721 Vehicle 25 dB | 4,65 (0,10) | - |
| G.721 Street 10 dB | 3,90 (0,17) | - |
| G.721 Office 20 dB | 4,49 (0,12) | - |

Differences compared to SEG‑4:

1) Different selection of MNRUs with noise added for Lab6 and Lab7.

2) Different noise types, G.721 as High Reference, Additional standards for Lab6.

 

Table A.3.1: DMOS (and SD) for EFR codec and G.728 for talker dependency
(lab1 to lab4, flat, - SEG‑4, Exp#4)

| Conditions | Lab1 BT | Lab2 CNET | Lab3 TD | Lab4 NEC |
|---|---|---|---|---|
| EFR Male Talkers | 4,89 (0,38) | 4,70 (0,46) | 4,77 (0,45) | 4,41 (0,73) |
| EFR Female Talkers | 4,91 (0,29) | 4,65 (0,56) | 4,81 (0,47) | 4,49 (0,65) |
| EFR Children | 4,82 (0,39) | 4,65 (0,53) | 4,83 (0,43) | 4,48 (0,71) |
| G.728 Male Talkers | 4,56 (0,59) | 4,32 (0,57) | 4,34 (0,61) | 4,36 (0,74) |
| G.728 Female Talkers | 4,61 (0,59) | 4,41 (0,55) | 4,36 (0,56) | 4,35 (0,74) |
| G.728 Children | 4,80 (0,46) | 4,40 (0,52) | 4,38 (0,57) | 4,50 (0,71) |

 

Table A.3.2: DMOS (and SD) for EFR codec and G.728 for talker dependency (lab7, flat)

| Conditions | EFR | G.728 |
|---|---|---|
| Male Talkers | 4,73 (0,51) | 4,49 (0,57) |
| Female Talkers | 4,64 (0,50) | 4,43 (0,56) |
| Children | 4,62 (0,59) | 4,37 (0,58) |

 

Differences compared to SEG‑4:

      Different selection of MNRUs, extra condition (TCH-FS), 16 listeners instead of 24


 

Change history

| SMG No. | Tdoc. No. | CR. No. | Section affected | New version | Subject/Comments |
|---|---|---|---|---|---|
| SMG#19 | | | | 5.0.0 | Phase 2+ version |
| SMG#22 | | | | 4.0.0 | Phase 2 version |
| SMG#27 | | | | 6.0.0 | Release 1997 version |
| SMG#29 | | | | 7.0.0 | Release 1998 version |
| SMG#31 | | | | 8.0.0 | Release 1999 version |

 

 

Change history

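| Date | TSG # | TSG Doc. | CR | Rev | Subject/Comment | Old | New |
|---|---|---|---|---|---|---|---|
| 03-2001 | 11 | | | | Version for Release 4 | | 4.0.0 |
| 06-2002 | 16 | | | | Version for Release 5 | 4.0.0 | 5.0.0 |
| 12-2004 | 26 | | | | Version for Release 6 | 5.0.0 | 6.0.0 |
| 06-2007 | 36 | | | | Version for Release 7 | 6.0.0 | 7.0.0 |
| 12-2008 | 42 | | | | Version for Release 8 | 7.0.0 | 8.0.0 |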

 
