Performance characterization of the GSM Enhanced Full Rate (EFR) speech codec
Specification: 46055
Summary
This document reports the results from the Pre-selection and Verification Phase of testing of the Enhanced Full Rate speech codec.
3GPP TR 46.055 V8.0.0 (2008-12)
Technical Specification
3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Performance characterization of the GSM Enhanced Full Rate (EFR) speech codec (Release 8)
The present document has been developed within the 3rd Generation Partnership Project (3GPP TM) and may be further elaborated for the purposes of 3GPP.
Keywords: GSM, speech, codec
3GPP Postal address
3GPP support office address: 650 Route des Lucioles - Sophia Antipolis, Valbonne - FRANCE; Tel.: +33 4 92 94 42 00; Fax: +33 4 93 65 47 16; Internet: http://www.3gpp.org
Contents
Foreword
Introduction
1 Scope
2 References
3 Abbreviations
4 Quality under error (EP0 - EP3) and tandeming conditions (Exp Number 1 and Exp Number 5)
5 Quality under background noise conditions (Exp Number 2 and Exp Number 3)
6 Talker dependency (Exp Number 4)
7 DTX system
7.1 Channel activity in DTX mode
7.1.1 Test procedure
7.1.2 Speech channel activity
7.1.3 Level compensation
7.1.4 Interleaving compensation
7.1.5 Estimated mean TDMA channel activity
7.2 DTX/CNI Informal Expert Listening tests
7.2.1 Introduction
7.2.2 Test environment
7.2.3 Results
8 Performance with DTMF tones
8.1 Introduction
8.2 Test environment
8.3 Results
9 Network information tones
10 Performance with special input signals
10.1 Music signals
10.2 Noise signals
11 Performance with different languages
12 Delay
13 Frequency response
13.1 Introduction
13.2 Test environment
13.3 Results
14 Complexity
15 Summary of the results from the subjective testing
Annex A: Summary of results (lab by lab)
A.1 Quality under Error and tandeming conditions
A.2 Quality under Background noise conditions
A.3 Quality for Talker Dependency (DMOS and SD)
Annex B: Change history
Foreword
This Technical Specification has been produced by the 3rd Generation Partnership Project (3GPP).
The contents of the present document are subject to continuing work within the TSG and may change following formal TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an identifying change of release date and an increase in version number as follows:
Version x.y.z
where:
x   the first digit:
    1   presented to TSG for information;
    2   presented to TSG for approval;
    3   or greater indicates TSG approved document under change control.
y   the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z   the third digit is incremented when editorial only changes have been incorporated in the document.
Introduction
The SMG2 Speech Experts Group (SEG) started its activity early in 1995 for the standardization of an Enhanced Full Rate speech codec. To assess the performance of the submitted candidates, the Group produced a test plan for the first phase of testing (the pre-selection phase), which is described in permanent document SEG-4 (ETSI SMG2 SEG: SEG-4 (v 1.0) "A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm"). This test plan is based on the general knowledge gained from past ITU-T and ETSI codec evaluation activities (for instance, the GSM half rate and the recent ITU-T 8 kbit/s exercises). At the end of this Pre-selection Phase, SMG decided to standardize the PCS 1900 codec, known as the US-1 codec. No formal characterisation testing has been performed for the selected codec.
The present document therefore reports the results from the Pre-selection and Verification Phase of testing only. Consequently, the results reported here are less detailed, and the confidence intervals for them are wider, than those obtained for the GSM half rate standardization (GSM 06.08, [3]) where specific and detailed characterisation testing was performed. In addition, not all laboratories followed the same pre-selection test plan, further complicating the interpretation of the results.
The following experiments included in SEG-4 were carried out by several laboratories in the Pre-selection Phase:
- Experiment 1: Quality under error and tandeming conditions (A-law, Modified IRS);
- Experiment 2: Quality under background noise conditions (Vehicular noise, UPCM, NoIRS);
- Experiment 3: Quality under background noise conditions (Background music, UPCM, NoIRS);
- Experiment 4: Talker Dependency (UPCM, NoIRS);
- Experiment 5: Quality under high error conditions - EP3 (A-law, Modified IRS).
A practical 'indirect' method of comparing performance between different results was adopted, using the Modulated Noise Reference Unit (MNRU) (see note) as a reference degradation. The MNRU also allows results to be normalised across different laboratories carrying out the same experiment, through the conversion of MOS scores to Equivalent Q (dB). The Q (dB) values introduced in a test normally range from 0 to 50 dB. In SEG-4, both Experiment#1 and Experiment#5 on error conditions cover this range; the other experiments do not.
NOTE: The MNRU is a device designed for producing speech-correlated noise that sounds subjectively like the quantising noise produced by log-companded PCM codecs. The device is subjectively calibrated for Mean Opinion Scores (MOS) against Q dB (where Q is the ratio of the speech to speech-correlated noise power). The 'Equivalent Q' of the codecs under test can be found from the corresponding MOS on the calibration curve of the MNRU (S-shaped curve).
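The conversion from a MOS to an Equivalent Q can be pictured as a lookup on the MNRU calibration curve. The following Python sketch is illustrative only: the calibration points, the function names `equivalent_q` and `differential_q`, and the piecewise-linear interpolation are assumptions made for this example. They are not taken from SEG-4 or from the laboratories' procedures, which may well have fitted an S-shaped curve instead.

```python
def equivalent_q(mos, calibration):
    """Map a MOS to an Equivalent Q (dB) by linear interpolation.

    calibration: list of (q_db, mos) pairs obtained for the MNRU in the
    same experiment; MOS is assumed to increase monotonically with Q.
    """
    points = sorted(calibration)
    # Clamp scores that fall outside the calibrated range.
    if mos <= points[0][1]:
        return float(points[0][0])
    if mos >= points[-1][1]:
        return float(points[-1][0])
    for (q_lo, mos_lo), (q_hi, mos_hi) in zip(points, points[1:]):
        if mos_lo <= mos <= mos_hi:
            if mos_hi == mos_lo:
                return float(q_lo)
            fraction = (mos - mos_lo) / (mos_hi - mos_lo)
            return q_lo + fraction * (q_hi - q_lo)
    raise ValueError("calibration curve is not monotonic")


def differential_q(codec_mos, reference_mos, calibration):
    """Equivalent Q of the codec minus that of the reference (dB)."""
    return (equivalent_q(codec_mos, calibration)
            - equivalent_q(reference_mos, calibration))


# Hypothetical MNRU calibration points (Q in dB, MOS); NOT measured data.
HYPOTHETICAL_CALIBRATION = [
    (0, 1.0), (10, 1.5), (20, 2.5), (30, 3.5), (40, 4.2), (50, 4.4),
]

print(equivalent_q(3.0, HYPOTHETICAL_CALIBRATION))         # 25.0
print(differential_q(3.0, 3.5, HYPOTHETICAL_CALIBRATION))  # -5.0
```

With this sign convention a negative Differential Q means the codec scored below the reference, which matches the way the annex tables are read.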
Only four laboratories ran tests which followed the Pre-selection Test Plan described in SEG-4 (BT/lab1, CNET/lab2, Tele Denmark/lab3, NEC/lab4). MOTOROLA/lab5 participated in the Pre-selection Phase but their experiments did not comply with SEG-4. TI/lab8 ran only one experiment from SEG-4. Results produced by COMSAT/lab6 following a NOKIA-designed test plan are part of the standardization of the codec in North America, and NOKIA/lab7 performed complementary experiments during the ETSI Pre-selection Phase.
As no further analysis has been undertaken to allow the averaging of scores across the different laboratories, results are reported in the annex on a laboratory-by-laboratory basis. For error and tandeming conditions, results are reported in terms of Equivalent Q (dB) values. For background noise conditions and talker dependency, results are reported in terms of DMOS values with either Confidence Interval (CI) or Standard Deviation (SD), as there is insufficient data available to normalise across laboratories via MNRU conditions.
The quality performance of the EFR codec is compared to High and Low references introduced in permanent documents SEG-3 (ETSI SMG2 SEG: SEG-3 "Selection Criteria for the Enhanced Full Rate Speech Coding Algorithm - Speech Quality Requirements") and SEG-4 (ETSI SMG2 SEG: SEG-4 (v 1.0) "A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm", Section 7). The Low and High references were chosen as representative of the "minimum" and "objective" performance targets respectively, and are reported in table 1.
Table 1: References per condition: High Ref., Low Ref. and G.728
Experiment (SEG-4) | Condition | High Ref. | Low Ref.
EXP#1 | EP0 | G.728 | G.728
EXP#1 | EP1 | MNRU 24 dB | TCH-FS (EP1)
EXP#1 | EP2 | TCH-FS (EP1) | TCH-FS (EP2)
EXP#5 | EP3 | TCH-FS (EP2) | TCH-FS (EP3)
EXP#1 | EP0 (tandem) | G.728 | G.728
EXP#1 | EP1 (tandem) | TCH-FS (EP1) | TCH-FS (EP1 tandem)
EXP#2 | Vehicle 10 | G.728 | G.728
EXP#3 | Music 20 | G.728 | G.728
EXP#4 | Male Talkers | G.728 | G.728
EXP#4 | Female Talkers | G.728 | G.728
EXP#4 | Children | G.728 | G.728
A figure showing the general trend of the EFR behaviour for error conditions in noise-free environment, compared to the high (G.728) and low (TCH-FS) references is added to individual laboratories' quantitative results (figure 15). The general quality performance of the EFR codec is summarised in table 15.
In the Verification Phase, the behaviour of the EFR codec under the following test conditions was tested:
- behaviour of the DTX System;
- performance with DTMF tones;
- performance with network information tones;
- performance with special input signals;
- performance with music signals;
- performance with noise signals;
- performance with different languages;
- delay of the TCH-EFR;
- frequency response;
- complexity.
The results of these tests are also included in this report under the respective clauses.
Furthermore, the EFR codec was checked for correct functioning for the following items:
- test of overload point;
- SID frame encoding;
- muting behaviour;
- idle channel behaviour.
No artefact or malfunctioning was detected for these items.
1 Scope
The present document gives background information on the performance of the GSM enhanced full rate speech codec. Experimental results from the Pre-selection and Verification tests carried out during the standardization process by the SEG (Speech Expert Group) are reported to give a more detailed picture of the behaviour of the GSM enhanced full rate speech codec under different conditions of operation.
2 References
The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.
[1] GSM 03.05: "Digital cellular telecommunications system (Phase 2+); Technical performance objectives".
[2] GSM 03.50: "Digital cellular telecommunications system (Phase 2+); Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system".
[3] GSM 06.08: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Performance of the GSM half rate speech codec".
[4] GSM 06.10: "Digital cellular telecommunications system (Phase 2+); Full rate speech transcoding".
[5] GSM 06.20: "Digital cellular telecommunications system (Phase 2+); Half rate speech transcoding".
3 Abbreviations
For the purposes of the present document, the following abbreviations apply:
A/D         Analogue to Digital
ADPCM       Adaptive Differential Pulse Code Modulation
ACR         Absolute Category Rating
BSC         Base Station Controller
BTS         Base Transceiver Station
C/I         Carrier-to-Interferer ratio
CI          Confidence Interval
CNI         Comfort Noise Insertion
CRC         Cyclic Redundancy Check
D/A         Digital to Analogue
DAT         Digital Audio Tape
DCR         Degradation Category Rating
DSP         Digital Signal Processor
DTMF        Dual Tone Multi Frequency
DTX         Discontinuous Transmission for power consumption and interference reduction
EFR         Enhanced Full Rate
ESP         Product of E (Efficiency), S (Speed) and P (Percentage of Power) of the DSP
FR          Full Rate
GBER        Average gross bit error rate
GSM         Global System for Mobile communications
HR          Half Rate
IRS         Intermediate Reference System (No IRS = rather flat)
ITU-T       International Telecommunication Union - Telecommunications Standardization Sector
MNRU        Modulated Noise Reference Unit
Mod. IRS    Modified IRS
MOPS        Million Operations per Second
MOS         Mean Opinion Score
MS          Mobile Station
MSC         Mobile Switching Centre
PCM         Pulse Code Modulation
PSTN        Public Switched Telecommunications Network
Q           Speech to speech-correlated noise power ratio in dB
SD          Standard Deviation
SEG         Speech Expert Group
SID         Silence Descriptor
SMG         Special Mobile Group
TCH-EFS     Traffic Channel Enhanced Full rate Speech
TCH-FS      Traffic Channel Full rate Speech
TCH-HS      Traffic Channel Half rate Speech
TDMA        Time Division Multiple Access
TMOPS       True Million Operations per Second
UPCM        Uniform or Linear PCM
VAD         Voice Activity Detector
WMOPS       Weighted Million Operations per Second
4 Quality under error (EP0 - EP3) and tandeming conditions (Exp Number 1 and Exp Number 5)
Four different Error Patterns (EP0, EP1, EP2 and EP3) were used, where:
EP0    without channel errors
EP1    C/I = 10 dB; 5 % GBER (well inside a cell)
EP2    C/I = 7 dB; 8 % GBER (at a cell boundary)
EP3    C/I = 4 dB; 13 % GBER (outside a cell)
A listening-only test was adopted using the Absolute Category Rating (ACR) method. The results are reported in terms of Equivalent Q (dB) values and Differential Q values (which compare the codec results to the High and Low references). For error and tandeming conditions, results are available from eight laboratories (lab1 to lab8). Tables of results on a lab-by-lab basis are shown in the annex of the present document (table A.1.1 to table A.1.8), negative values indicating worse performance than the reference.
In general, across all laboratories, the EFR codec performs better than the reference TCH-FS for clear speech (EP0), for error conditions EP1 and EP2 and for tandeming under error EP1 conditions. For severe error condition (EP3), the performance is worse than TCH-FS in one laboratory. The EFR is equivalent to the reference G.728 (high reference) for clear speech in all laboratories. Under error conditions, the high reference threshold for severe error condition (EP3) is not met in all laboratories while the threshold for EP1 and EP2 is met for, roughly, half of the laboratories. Under tandeming, the clear condition was tested in only one laboratory where it was compared to another standard G.721; the results indicate that the performance of the EFR (EP0 tandem) is equivalent to that of G.721 (EP0). For tandeming under error condition EP1, equivalence with TCH-FS (EP1) without tandeming is demonstrated in all laboratories except one. Additional results coming from one lab only can be found in table A.1.6 (effect of input levels, other error conditions, tandeming with other standards).
The advantage of the EFR over the existing TCH-FS depends on the quality of the network: as channel errors increase, the advantage is reduced. The general trend of the EFR behaviour in error conditions is shown in figure 15.
5 Quality under background noise conditions (Exp Number 2 and Exp Number 3)
Quality under background noise conditions was assessed with a listening-only test, using the Degradation Category Rating (DCR) method. The results are reported for the EFR codec, the Reference G.728 and the TCH-FS codec in terms of DMOS values with Confidence Interval (CI). Six laboratories (lab1 to lab4, lab6 and lab7) performed this experiment, the first four complying with SEG-4 (see table A.2.1 and table A.2.2).
For each laboratory, the differences in DMOS scores between the EFR codec and the Reference G.728 are of the same order as the confidence intervals for the EFR codec results, with the exception of one point (vehicle noise) in one laboratory. From this, it can be concluded that the performance of the EFR codec, under the background noise conditions tested is equivalent to that of the quality reference G.728 for all laboratories and also to G.721 (tested in one lab only). The degradation introduced by the EFR codec compared to the DIRECT connection in background noise conditions is rated between "unnoticeable" and "noticeable but not annoying". A substantial improvement is achieved over the full rate with music in the background. Additional results from one laboratory can be found in table A.2.2.
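The comparison made above, a DMOS difference set against the width of a confidence interval, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the votes are invented, the function names are hypothetical, and the 95 % normal-approximation interval (z = 1.96) is an assumption, since the present document does not state how the laboratories computed their confidence intervals.

```python
import math
import statistics


def dmos_with_ci(votes, z=1.96):
    """Return (mean DMOS, half-width of the confidence interval)."""
    mean = statistics.mean(votes)
    sd = statistics.stdev(votes)  # sample standard deviation
    half_width = z * sd / math.sqrt(len(votes))
    return mean, half_width


def difference_within_ci(codec_votes, reference_votes):
    """True if the DMOS difference does not exceed the codec's CI half-width."""
    codec_mean, codec_ci = dmos_with_ci(codec_votes)
    reference_mean, _ = dmos_with_ci(reference_votes)
    return abs(codec_mean - reference_mean) <= codec_ci


# Hypothetical votes on the 5-point DCR scale; NOT data from the report.
codec_votes = [4, 5, 4, 4, 5, 3, 4, 5]
reference_votes = [4, 5, 4, 5, 5, 3, 4, 5]

mean, ci = dmos_with_ci(codec_votes)
print(f"DMOS = {mean:.2f} +/- {ci:.2f}")                    # DMOS = 4.25 +/- 0.49
print(difference_within_ci(codec_votes, reference_votes))   # True
```

The same `statistics.stdev` call gives the Standard Deviation (SD) figure used for the talker dependency results.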
6 Talker dependency (Exp Number 4)
Talker dependency was assessed with a listening-only test using the Degradation Category Rating (DCR) methodology. Results, available from five laboratories (lab1 to lab4 and lab7), are reported in terms of DMOS values with their associated Standard Deviation (SD) to give a measure of the spread of the scores about the averages for each gender, for both the EFR codec and the Reference G.728. These experiments clearly show that the standard deviation of the scores of the EFR codec for each gender is smaller than the standard deviation of the reference G.728 in each laboratory. The talker dependency performance of the EFR codec is therefore equivalent to that of G.728. The gender dependency is also equivalent to that of the G.728 codec. Tables of results lab-by-lab are shown in the annex (table A.3.1 to table A.3.2).
7 DTX system
7.1 Channel activity in DTX mode
7.1.1 Test procedure
A carefully selected subset of the speech material recorded for testing the half rate DTX system was processed through the codec/DTX C-language simulation. This material comprised 48 real conversations in the English, German and Italian languages. The channel activity of the system was measured for all 48 conversations, and the mean channel activity was then calculated.
7.1.2 Speech channel activity
The percentage of speech frames scheduled for transmission by the radio subsystem (subsequently referred to as the speech channel activity) varied significantly between conversations. Speech channel activities ranged from 29% to 93% for individual sides of a conversation. For this reason, it was not possible to identify any significant trends in the results with regard to terminal type and environmental conditions. The mean speech channel activity, measured over all 48 conversations, was 61 %.
7.1.3 Level compensation
After calculating the mean speech channel activity, it was found that the speech material had been processed at a level 6,5 dB below the original recorded level. Since the activity of the basic VAD algorithm rises by approximately 0,5 % per dB increase in input level, a compensation of 3 % (6,5 dB at 0,5 % per dB, rounded) must be added to the speech channel activity estimate.
7.1.4 Interleaving compensation
The channel activity measurements were calculated on a signal frame basis. However, the use of interleaving (depth 4) implies that the TDMA activity will be approximately 2 % higher than the signal frame activity.
7.1.5 Estimated mean TDMA channel activity
The estimated mean TDMA channel activity is shown in table 7.1.5.1.
Table 7.1.5.1: Calculation of mean TDMA channel activity
speech channel activity | 61 %
level compensation | 3 %
interleaving compensation | 2 %
total TDMA channel activity | 66 %
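The arithmetic behind table 7.1.5.1 can be reproduced with a few lines of Python. The input figures (61 %, 6,5 dB, 0,5 % per dB and 2 %) come from subclauses 7.1.2 to 7.1.4; the function name and the rounding of the level compensation to a whole percent are assumptions made for this sketch.

```python
def estimate_tdma_activity(speech_activity_pct, level_offset_db,
                           vad_slope_pct_per_db, interleaving_pct):
    """Return (level compensation, total TDMA channel activity) in percent."""
    # 6.5 dB * 0.5 %/dB = 3.25 %, rounded to 3 % (rounding is an assumption).
    level_compensation = round(level_offset_db * vad_slope_pct_per_db)
    total = speech_activity_pct + level_compensation + interleaving_pct
    return level_compensation, total


level_compensation, total = estimate_tdma_activity(
    speech_activity_pct=61,    # mean speech channel activity (7.1.2)
    level_offset_db=6.5,       # processing level below recorded level (7.1.3)
    vad_slope_pct_per_db=0.5,  # VAD activity increase per dB (7.1.3)
    interleaving_pct=2,        # interleaving compensation (7.1.4)
)
print(level_compensation, total)  # 3 66
```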
7.2 DTX/CNI Informal Expert Listening tests
7.2.1 Introduction
To check the performance of the DTX / CNI system of the ETSI GSM EFR codec, informal expert listening tests were carried out in Italian and German. A very brief check of English speech samples was also made. Special attention was given to clipping effects and noise.
7.2.2 Test environment
Out of the speech samples from the HR codec DTX tests, 8 conversations were selected by CSELT, Deutsche Telekom and British Telecom, respectively. These samples were processed by Nokia and recorded on a DAT, one track without VAD/DTX processing and one track with the DTX / CNI system. By comparing the non-DTX and DTX speech, the listeners could judge the degradation to be not noticeable, minor, moderate or severe. Listeners were allowed to rewind the tape to listen to critical sections again. The listening device was a high quality headset in mono operation, presenting either the track 0 or the track 1 signal on both speakers.
7.2.3 Results
In all the speech samples, only two clippings were judged to be noticeable. On comfort noise insertion, conversations with almost no or low background noise were found to have no noticeable degradation. With increasing background noise, the noise related degradation was judged from minor to moderate (the latter in two sections of two conversations). The overall performance of the DTX / CNI system was seen to be fully satisfactory with mostly no or minor degradation.
8 Performance with DTMF tones
8.1 Introduction
A desirable requirement for the GSM Enhanced Full Rate speech codec is DTMF transparency not worse than that of the GSM Full Rate codec. DTMF transmission was therefore tested as part of the verification of the ETSI Enhanced Full Rate codec.
8.2 Test environment
A PC board based on a DSP (NEC µPD77016) was used to measure the transmission of the codec under test. The DTMF software is derived from the Goertzel algorithm, which calculates the spectral powers of distinctive frequencies by means of a recursive digital filter scheme. The DTMF signal detection is based on "quality factors" calculated from the Distinctive Frequency Test results. Within a wide dynamic range this technique is independent of the absolute signal level. PTT approvals based on the same hardware and software are available for equipment of European telecom houses.
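To illustrate the recursive filter scheme mentioned above, the following Python sketch implements a generic Goertzel power estimate and a very simple DTMF decision. It is not the DSP software used in the tests: the "quality factors" are not reproduced, the detector merely picks the strongest row and column frequency, and all function names, the 8 kHz sampling rate default and the amplitudes are assumptions for illustration. The eight frequencies and the keypad layout are the standard DTMF ones.

```python
import math

ROW_FREQS = (697, 770, 852, 941)       # Hz, standard DTMF low group
COL_FREQS = (1209, 1336, 1477, 1633)   # Hz, standard DTMF high group
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, sample_rate, freq):
    """Spectral power of `samples` at `freq` via the Goertzel recursion."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def dtmf_tone(key, duration_ms=80, sample_rate=8000, row_amp=0.5, col_amp=0.5):
    """Generate one DTMF tone; unequal amplitudes model twist."""
    row = next(i for i, keys in enumerate(KEYPAD) if key in keys)
    col = KEYPAD[row].index(key)
    n = sample_rate * duration_ms // 1000
    return [
        row_amp * math.sin(2.0 * math.pi * ROW_FREQS[row] * t / sample_rate)
        + col_amp * math.sin(2.0 * math.pi * COL_FREQS[col] * t / sample_rate)
        for t in range(n)
    ]


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row and column frequency (simplified decision)."""
    row_powers = [goertzel_power(samples, sample_rate, f) for f in ROW_FREQS]
    col_powers = [goertzel_power(samples, sample_rate, f) for f in COL_FREQS]
    row = row_powers.index(max(row_powers))
    col = col_powers.index(max(col_powers))
    return KEYPAD[row][col]


print(detect_key(dtmf_tone("5")))  # 5
```

A real detector would add thresholds on absolute power, twist and tone duration before accepting a key; in a test like the one described here, the samples would be the decoder output rather than the clean tone.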
DTMF signals were tested only under ideal transmission conditions. Error patterns, as used in the half rate tests, were not simulated. In the different experiments the input signals were varied in tone and pause length, amplitude (also introducing twist, i.e. different amplitudes in the two components of the tone) and frequency. In all experiments 10 tones were input to the codec. The resulting files were processed by the DTMF detector. As the minimum tone length specified for the input signal of a detector is 80 ms, while the minimum output length of a DTMF generator may be smaller, a test was also carried out with a 60 ms tone input to the codec.
8.3 Results
The test results shown in table 8.3.1 represent the detected tones from the 10 input signals. Table 8.3.2 summarises the test conditions. With input signals fully within the specified range no detection problems were observed. The shortest allowed input signal to a transmission line (80 ms) was detected 100 % in all experiments with different input levels, twist and frequency deviations. An effect known from the HR codec tests, in which long tones were detected as two tones, was not observed. Only for tones shorter than 80 ms did the detection rate fall, to 96 %, without a sharp decrease and without any particular tone showing problems.
In conclusion, the codec was found to be 100 % transparent to DTMF signals under nominal conditions. Only tones shorter than the minimum input specification of 80 ms are not fully detected. The results are better than those for the FR codec. The requirement is fulfilled.
Table 8.3.1: Results of DTMF experiments
experiment / tone | N18 | N22 | N18-22 | N22-26 | D18 | D18-22 | L 120 | L 200 | S 60
1 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
2 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9
3 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
4 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9
5 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
6 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
7 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
8 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
9 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
0 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
* | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
# | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9
A | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
B | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9
C | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 9
D | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10
total_d | 160 | 160 | 160 | 160 | 160 | 160 | 160 | 160 | 155
det_rate | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 96
In rows 1 - D the number of detected tones from 10 inputs is shown.
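The total_d and det_rate rows follow directly from the per-tone counts. As a check, the short Python sketch below recomputes them for experiment S 60 using the counts from table 8.3.1. The names are illustrative, and truncating the rate to a whole percent is an assumption, chosen because 155 of 160 is 96,875 % while the table reports 96.

```python
# Detected tones (out of 10 inputs) per key for experiment "S 60",
# copied from table 8.3.1.
S60_DETECTED = {
    "1": 10, "2": 9, "3": 10, "4": 9, "5": 10, "6": 10, "7": 10, "8": 10,
    "9": 10, "0": 10, "*": 10, "#": 9, "A": 10, "B": 9, "C": 9, "D": 10,
}
INPUTS_PER_KEY = 10


def detection_rate(detected, inputs_per_key=INPUTS_PER_KEY):
    """Return (total detected tones, detection rate in whole percent)."""
    total_detected = sum(detected.values())
    total_inputs = inputs_per_key * len(detected)
    # Integer division truncates, matching the 96 reported in the table.
    return total_detected, 100 * total_detected // total_inputs


print(detection_rate(S60_DETECTED))  # (155, 96)
```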
Table 8.3.2: Conditions of above listed experiments
experiment | N18 | N22 | N18-22 | N22-26 | D18 | D18-22 | L 120 | L 200 | S 60 | unit
tone | 80 | 80 | 80 | 80 | 80 | 80 | 120 | 200 | 60 | ms
pause | 80 | 80 | 80 | 80 | 80 | 80 | 120 | 80 | 60 | ms
r_amp | -18 | -22 | -18 | -22 | -22 | -18 | -22 | -28 | -22 | dB
c_amp | -18 | -22 | -22 | -26 | -22 | -22 | -22 | -28 | -22 | dB
delta_f | 0 | 0 | 0 | 0 | 2 | 2 | 0 | 0 | 0 | %
r_amp and c_amp are the row amplitude and column amplitude respectively; dB values are relative to the overload point.
9 Network information tones
The signals shown in table 9 were first compressed by the encoder, then decompressed by the decoder, and then listened to via quality headphones using a high-quality PC audio card. The codec showed no perceivable degradation in the transmission of these PSTN network information tones, both with and without the VAD/DTX system switched on. No clipping or other disturbing artefacts were noticed when DTX was enabled. A check of the tones in use around the world, as listed in ITU Recommendation E.180 Supplement 2 (Jan 94), indicated that testing UK, German and USA tones alone achieves almost 100 % global coverage.
Table 9: PSTN Information Tones Tested
| German (no DTX) | German (with DTX) |
| 3 dial tones | 2 dial |
| 1 ringing tone | 1 ringing |
| 2 busy tones (subscriber engaged) | 2 busy |
| 1 special information tone (number unobtainable) | 1 special information tone |
| 2 congestion tones (network equipment engaged) | 1 fax modem call setup tone sequence |
| United Kingdom (no DTX) | United Kingdom (with DTX) |
| 3 dial tones | 1 dial |
| 1 ringing tone | 1 ring |
| 1 busy tone (subscriber engaged) | 1 busy |
| 1 congestion tone (network equipment engaged) | 1 congest |
| - | 1 sustained, low-level sinusoid (number unobtainable) |
| USA (no DTX) | USA (with DTX: not tested) |
| 1 dial tone | - |
| 1 ringing tone | - |
| 1 busy tone (subscriber engaged) | - |
| 1 special information tone | - |
| 1 congestion tone (network equipment engaged) | - |
Tones were computer generated for the tests in which DTX was switched off. Authentic DAT recordings of PSTN information tones were used to check the performance with DTX switched on, except the low-level sinusoid signal for "UK number unobtainable" which was computer generated.
Two kinds of special input signals have been chosen to be tested in the verification phase of the Enhanced Full Rate: music signals and noise signals.
10.1 Music signals
This subclause reports on the informal listening tests conducted at CSELT to evaluate the performance of the EFR codec with music signals.
The tests were informal pair comparison tests (A versus B without repetition) covering the Full Rate codec, the Enhanced Full Rate codec and the ITU-T G.726 ADPCM codec at 32 kbit/s. The tests involved 6 music items taken from those selected by ISO-MPEG to test audio codec standards. Each music item lasts between 8 and 10 seconds. The music items were downsampled to 8 kHz before processing. Listening was performed by 12 naive listeners through headphones.
The results are reported in tables 10.1.1 and 10.1.2.
Table 10.1.1: Results of the informal test on performance with music signals: Enhanced Full Rate versus Full Rate
| Music Items | Enhanced Full Rate preferred to Full Rate | Enhanced Full Rate equal to Full Rate | Full Rate preferred to Enhanced Full Rate |
| Harpsichord | 100 % | 0 % | 0 % |
| Carmen | 25 % | 41,7 % | 33,3 % |
| Trumpet | 100 % | 0 % | 0 % |
| Castanets | 33,3 % | 41,7 % | 25 % |
| Mediterraneo | 41,7 % | 33,3 % | 25 % |
| Vivaldi "The spring" | 100 % | 0 % | 0 % |
| Total | 66,7 % | 19,4 % | 13,9 % |
Table 10.1.2: Results of the informal test on performance with music signals: Enhanced Full Rate versus ADPCM 32 kbit/s
| Music Items | Enhanced Full Rate preferred to ADPCM | Enhanced Full Rate equal to ADPCM | ADPCM preferred to Enhanced Full Rate |
| Harpsichord | 50 % | 8,3 % | 41,7 % |
| Carmen | 0 % | 25 % | 75 % |
| Trumpet | 33,3 % | 33,3 % | 33,3 % |
| Castanets | 8,3 % | 41,7 % | 50 % |
| Mediterraneo | 16,7 % | 25 % | 58,3 % |
| Vivaldi "The spring" | 16,7 % | 25 % | 58,3 % |
| Total | 20,9 % | 26,4 % | 52,7 % |
The analysis of results shows a certain dependency of performance on the music item. There is at least one item in which the FR has been judged better than the EFR. Nevertheless, on the average, the EFR provides better performance than the FR, whilst it appears to perform worse than the ADPCM.
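The percentages in table 10.1.1 can be read as vote counts out of the 12 listeners. The sketch below shows that reading and how the Total row follows from it. The vote counts are inferred here from the published percentages (for example 41,7 % taken as 5 of 12) and are not given in the source; the function names are illustrative.

```python
# Votes per item for table 10.1.1, as (EFR preferred, equal, FR preferred).
# Inferred from the percentages and 12 listeners; an assumption of this sketch.
VOTES_EFR_VS_FR = {
    "Harpsichord": (12, 0, 0),
    "Carmen": (3, 5, 4),
    "Trumpet": (12, 0, 0),
    "Castanets": (4, 5, 3),
    "Mediterraneo": (5, 4, 3),
    'Vivaldi "The spring"': (12, 0, 0),
}


def percentages(votes):
    """Convert a tuple of vote counts to percentages with one decimal."""
    total = sum(votes)
    return tuple(round(100.0 * v / total, 1) for v in votes)


def totals(table):
    """Pool the votes of all items and return the overall percentages."""
    pooled = [sum(column) for column in zip(*table.values())]
    return percentages(pooled)


print(percentages(VOTES_EFR_VS_FR["Carmen"]))
print(totals(VOTES_EFR_VS_FR))
```

Pooling all 72 judgements in this way gives the Total row of table 10.1.1.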
10.2 Noise signals
To check the scaling performance of the fixed-point algorithm, noise signals with levels ranging from -10 dB down to -70 dB were processed by the encoder and decoder in error-free conditions. The level of the decoder output signal was examined. For all signals the reconstructed output level followed the input level. Even for very low signal levels no problems were detected.
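A level-tracking check of this kind can be sketched as follows. The codec itself is not reproduced: `passthrough` is a hypothetical stand-in for encoder plus decoder, and measuring RMS levels in dB relative to a 16-bit full-scale amplitude is an assumption of this sketch, as are the noise length and the 10 dB level steps.

```python
import math
import random

OVERLOAD = 32767.0  # assumed overload amplitude (16-bit full scale)


def rms_level_db(samples, overload=OVERLOAD):
    """RMS level in dB relative to the assumed overload amplitude."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms / overload)


def noise_at_level(level_db, n=8000, seed=0, overload=OVERLOAD):
    """Gaussian noise rescaled so that its RMS level equals level_db."""
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    raw_rms = math.sqrt(sum(s * s for s in raw) / n)
    target_rms = overload * 10.0 ** (level_db / 20.0)
    return [s * target_rms / raw_rms for s in raw]


def level_tracking(codec, levels_db):
    """Map each input level to the level measured after the codec."""
    return {lv: rms_level_db(codec(noise_at_level(lv))) for lv in levels_db}


def passthrough(samples):
    """Hypothetical stand-in for encoder plus decoder; returns its input."""
    return list(samples)


# Input levels from -10 dB down to -70 dB in 10 dB steps.
result = level_tracking(passthrough, range(-10, -71, -10))
for level, measured in result.items():
    print(level, round(measured, 2))
```

With a real codec in place of `passthrough`, the output level following the input level would show up as measured values close to the requested ones.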
This clause deals with the results of an informal listening test to evaluate the performance of the EFR for some languages which were not tested formally.
The tests were informal pair comparisons (A versus B without repetition) covering the Full Rate codec, the Enhanced Full Rate codec and the ITU-T G.726 ADPCM codec at 32 kbit/s. The tests involved 5 different languages (Arab, Chinese, Japanese, Polish and Portuguese). Listening and recording were performed by naive native speakers. For most languages, however, only one listener was available, so the results should be treated with due caution.
The test was performed by gathering native speakers of the different languages at the CSELT premises. Subjects were asked to record a list of sentences in their own language. The sentence length ranged from 4 to 6 seconds. The languages and the numbers of listeners and sentences are reported in table 11.1.
Table 11.1: List of languages and the number of listeners and sentences used
| LANGUAGE | NUMBER OF LISTENERS | NUMBER OF SENTENCES |
| Arab | 2 | 8 |
| Chinese | 1 | 8 |
| Japanese | 1 | 8 |
| Polish | 2 | 8 |
| Portuguese | 1 | 8 |
The subjects were asked to listen to an A-B sequence and could either express a preference or judge the perceived quality to be the same. The results of the test are reported in tables 11.2 and 11.3.
Table 11.2: Results of the informal test on languages not covered in the formal tests: Enhanced Full Rate versus Full Rate
| Languages | Enhanced Full Rate preferred to Full Rate | Enhanced Full Rate equal to Full Rate | Full Rate preferred to Enhanced Full Rate |
| Arab | 37,5 % | 50 % | 12,5 % |
| Chinese | 100 % | 0 % | 0 % |
| Japanese | 100 % | 0 % | 0 % |
| Polish | 68,7 % | 12,5 % | 18,8 % |
| Portuguese | 75 % | 25 % | 0 % |
Table 11.3: Results of the informal test on languages not covered in the formal tests: Enhanced Full Rate versus ADPCM at 32 kbit/s
| Languages | Enhanced Full Rate preferred to ADPCM | Enhanced Full Rate equal to ADPCM | ADPCM preferred to Enhanced Full Rate |
| Arab | 18,75 % | 75 % | 6,25 % |
| Chinese | 87,5 % | 12,5 % | 0 % |
| Japanese | 87,5 % | 12,5 % | 0 % |
| Polish | 25 % | 37,5 % | 37,5 % |
| Portuguese | 12,5 % | 50 % | 37,5 % |
The analysis of the results confirms the good performance of the Enhanced full-rate also for languages not considered in the formal experiments.
This seems to be the case for all the languages tested, even though the test size was very small. The EFR was always preferred in comparison to the Full-rate. For Chinese and Japanese the preference is stronger and, for these languages, the EFR is preferred also to the ADPCM at 32 kbit/s in most of the cases.
The round-trip delay of a communication using a TCH-EFS has been estimated taking into account all the system and processing delays.
The symbol definitions for the calculations in this section are:
Tabisd: The time required to transmit the 260 speech frame data bits (bits D1 to D260, C16 and the 17 synchronization bits -> 278 bits) over the 16 kbit/s A-bis interface in the downlink direction (system dependent).
Tabisu: The time required to transmit the first 137 TRAU frame bits, the first 34 of which can be sent by anticipation, leading to a delay of 103 TRAU frame bits (D2 to D98 speech frame data bits including the CRCs + 6 synchronization bits) over the 16 kbit/s A-bis interface in the uplink direction (system dependent).
Tad: Delay in the analogue to digital converter in the uplink.
Tbsc: Switching delay in the BSC (implementation dependent).
Tbuff: Due to the time alignment procedure for inband control of the remote transcoder in case of a 16 kbit/s A-bis interface in the downlink direction, it is required to have a buffer in the BTS of 1 ms + one 250 µs regulation step (system dependent).
Tda: Delay in the digital to analogue converter in the downlink.
Techo: Delay due to the echo canceller.
Tencode: The time required for the channel encoder to perform channel encoding (implementation dependent).
Tmsc: Switching delay in the MSC.
Tpcm: The duration of a segment of PCM speech for the downlink processing delay.
Tproc: The time required after reception of the last encoded speech parameter of the first subframe (FCB-Gain1) to process the speech encoded data for the enhanced full rate speech decoder and to produce the first PCM output sample (implementation dependent).
Trftx: The time required for transmission of a TCH radio interface frame over the air interface due to the interleaving and de-interleaving (system dependent).
Trxproc: The time required after reception over the radio interface to perform equalization, channel decoding and SID-frame detection (implementation dependent).
Tsample: The duration of the segment of PCM speech operated on by the speech transcoder.
Tsps: Delay of the speech encoder in the BSC after reception of the last PCM sample until availability of the first encoded bit (implementation dependent).
Ttransc: The MS speech encoder processing time, from input of the last PCM sample to output of the final encoded bit (implementation dependent).
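The two A-bis transmission times follow from the bit counts in the definitions of Tabisd and Tabisu and the 16 kbit/s link rate. The short sketch below works them out; the function and constant names are illustrative only.

```python
ABIS_RATE_BIT_PER_S = 16_000  # 16 kbit/s submultiplexed A-bis link


def abis_delay_ms(bits, rate_bit_per_s=ABIS_RATE_BIT_PER_S):
    """Time in ms needed to send `bits` bits over the A-bis link."""
    return 1000.0 * bits / rate_bit_per_s


# Downlink: 260 data bits + C16 + 17 synchronization bits.
TABISD_BITS = 260 + 1 + 17
# Uplink: first 137 TRAU frame bits, 34 of which are sent by anticipation.
TABISU_BITS = 137 - 34

print(TABISD_BITS, abis_delay_ms(TABISD_BITS))  # 278 bits
print(TABISU_BITS, abis_delay_ms(TABISU_BITS))  # 103 bits
```

The results, 17,375 ms and 6,4375 ms, are the Tabisd and Tabisu values used in the delay tables.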
The processing delays were estimated from the detailed complexity figures previously computed in the verification phase. The complexity estimation is based on rules that are intended to be relevant from an implementation point of view while remaining independent of specific DSPs. The same approach was therefore followed for the processing delays. The DSP that runs the codec has been modelled through three parameters E, S and P.
E stands for the Efficiency of the DSP. This corresponds to the ratio TMOPS/WMOPS of the implementation of the codec on the DSP.
S stands for the Speed of the DSP: Maximum Number of Operations that the DSP can run in 1 second. This number is expressed in MOPS.
P stands for the percentage of DSP processing power assigned to the codec.
The processing delay of a task whose complexity is X can then be computed using the formula:
D = X * 20 / ESP,
the time unit being ms.
The following assumptions were made when computing the round-trip delay:
- for the Enhanced Full Rate MS delay, it is assumed that the DSP has the same performance as the DSP used for GSM HR [5];
- for the Enhanced Full Rate BSC delays, it is assumed that the DSP of the TRAU will have the same performance as the DSP used for GSM HR;
- for the Enhanced Full Rate BTS delay, it is assumed that the DSP will have the same performance as the DSP used for GSM FR [4]. The reason is that it is assumed that the GSM Full Rate BTS will be reused during first GSM EFR deployments;
- a 16 kbit/s submultiplexed A-bis is used between the BTS and the BSC-TRAU.
From these assumptions and following the complexity of GSM HR [3] and its delay requirement for the MS [2], the ESP value has been computed for EFR:
ESP = 25
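As a worked illustration of the formula D = X*20/ESP, the sketch below recomputes some processing delays from complexity figures listed in table 12.1, using ESP = 25. Reading ESP as the product of E, S and P, and the factor 20 as the 20 ms speech frame, is an interpretation made here; the function and constant names are illustrative.

```python
FRAME_MS = 20.0  # factor 20 in the formula, read here as the frame length in ms
ESP_EFR = 25.0   # ESP value stated for EFR


def processing_delay_ms(complexity_wmops, esp=ESP_EFR):
    """Processing delay D = X * 20 / ESP in ms for a task of complexity X."""
    return complexity_wmops * FRAME_MS / esp


tproc = processing_delay_ms(1.59)       # speech decoder, 1,59 WMOPS
ttransc = processing_delay_ms(15.21)    # MS speech encoder, 15,21 WMOPS
chan_dec = processing_delay_ms(2.45)    # channel decoding, 2,45 WMOPS

print(round(tproc, 3), round(ttransc, 3), round(chan_dec, 3))
# Share of Trxproc (8,8 ms) left for the equaliser.
print(round(8.8 - chan_dec, 2))
```

Rounded to two decimals, the first two results are the 1,27 ms and 12,17 ms entries for Tproc and Ttransc, and the last line gives the 6,84 ms mentioned in the table notes.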
The following delays, provided in [1] and [2] for the GSM Full Rate and common to the GSM Enhanced Full Rate, are considered realistic and therefore retain the same values:
- MSC: Tmsc, margin
- BSC: Tbsc, margin
- BTS: Trxproc, margin
- MS: Trftx, Tda
The results of the estimation are provided in table 12.1 for uplink and table 12.2 for downlink. The time unit for all delays is ms (10^-3 s).
Table 12.1: Uplink delay
| Equipment | Speed Parameter | Delay (ms) | Data |
| MSC | Tmsc | 0,5 | |
| | margin | 0,5 | |
| BSC | Tbsc | 0,5 | |
| | Tproc | 1,27 | 1,59 WMOPS |
| | margin | 0,5 | |
| BTS | Tabisu | 6,4375 | 103 bits |
| | Trxproc | 8,8 | 2,45 WMOPS (note) |
| | margin | 3 | |
| MS | Trftx | 37,5 | |
| | Tencode | 0,32 | 0,20 WMOPS |
| | Ttransc | 12,17 | 15,21 WMOPS |
| | Tsample | 20 | |
| | Tmargin | 2 | |
| | Tad | 1 | |
| SUM | Uplink | 94,4975 | |
NOTE: This theoretical complexity corresponds to the channel decoding only. This leaves 6,84 ms for the equaliser in Trxproc.
Table 12.2: Downlink delay
| Equipment | Speed Parameter | Delay (ms) | Data |
| MSC | Techo | 1 | |
| | Tmsc | 0,5 | |
| | margin | 0,5 | |
| BSC | Tbsc | 0,5 | |
| | Tsample | 20 | |
| | Tsps | 2,3 | |
| | Tabisd | 17,375 | 278 bits |
| | margin | 0,5 | |
| BTS | Tbuff | 1,25 | |
| | Tencode | 1,60 | 0,20 WMOPS |
| | margin | 0,45 | |
| MS | Trftx | 37,5 | |
| | Trxproc | 8,8 | 2,45 WMOPS (note) |
| | Tproc | 1,27 | 1,59 WMOPS |
| | margin | 2 | |
| | Tda | 1 | |
| SUM | Downlink | 96,547 | |
NOTE: This theoretical complexity corresponds to the channel decoding only. This leaves 6,84 ms for the equaliser in Trxproc.
Round-trip delay = Uplink delay + Downlink delay = 191,04 ms
This delay is very close to the delay indicated in [1], [2] and [3] for GSM Full Rate: 188,5 ms. The difference should be unnoticeable.
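The sums in tables 12.1 and 12.2 can be rechecked with a few lines of code. This is only a sketch: the dictionary keys are labels chosen here, and Tproc and Ttransc are entered unrounded (1,272 ms and 12,168 ms, obtained from D = X*20/ESP with ESP = 25), an assumption that reproduces the reported downlink sum of 96,547 ms.

```python
# Delay contributions in ms, following the rows of table 12.1 (uplink).
UPLINK_MS = {
    "MSC Tmsc": 0.5, "MSC margin": 0.5,
    "BSC Tbsc": 0.5, "BSC Tproc": 1.272, "BSC margin": 0.5,
    "BTS Tabisu": 6.4375, "BTS Trxproc": 8.8, "BTS margin": 3.0,
    "MS Trftx": 37.5, "MS Tencode": 0.32, "MS Ttransc": 12.168,
    "MS Tsample": 20.0, "MS Tmargin": 2.0, "MS Tad": 1.0,
}

# Delay contributions in ms, following the rows of table 12.2 (downlink).
DOWNLINK_MS = {
    "MSC Techo": 1.0, "MSC Tmsc": 0.5, "MSC margin": 0.5,
    "BSC Tbsc": 0.5, "BSC Tsample": 20.0, "BSC Tsps": 2.3,
    "BSC Tabisd": 17.375, "BSC margin": 0.5,
    "BTS Tbuff": 1.25, "BTS Tencode": 1.60, "BTS margin": 0.45,
    "MS Trftx": 37.5, "MS Trxproc": 8.8, "MS Tproc": 1.272,
    "MS margin": 2.0, "MS Tda": 1.0,
}

uplink = sum(UPLINK_MS.values())
downlink = sum(DOWNLINK_MS.values())
round_trip = uplink + downlink

print(round(uplink, 4), round(downlink, 3), round(round_trip, 2))
```

With the rounded table entry 1,27 ms for Tproc, the downlink rows add up to 96,545 ms instead; the uplink sum is the same either way.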
13.1 Introduction
A characteristic test in the verification of GSM speech codecs is the frequency response test. Sine tones in the telephony frequency band are input to the codecs, and after decoding the gain is calculated. It has to be pointed out that the frequency response measurement is given just as a piece of additional information and does not add information on the actual behaviour of the codec in terms of perceived quality or DTMF transparency.
13.2 Test environment
The tones were generated at a nominal level of 22 dB below the overload point. Tones ranging from 80 Hz to 3 600 Hz in steps of 21 Hz, each with a nominal length of 2 s, were input to the codec under test. After decoding, the gain was calculated over 400 ms intervals, and these interval results were then averaged over the total duration of each frequency to obtain the frequency response curve. The interval-based calculation made it possible to check the transition behaviour of the codec and, if necessary, to disregard the first samples.
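The measurement procedure can be sketched as follows. The codec is not reproduced: `half_gain` is a hypothetical stand-in that simply halves the amplitude, and the 8 kHz sampling rate, the use of 16-bit full scale as the overload point and the RMS-based gain definition are assumptions of this sketch.

```python
import math

FS = 8000            # assumed sampling rate in Hz
OVERLOAD = 32767.0   # assumed overload amplitude (16-bit full scale)


def sine(freq_hz, level_db, duration_s, fs=FS):
    """Sine tone with a peak level in dB relative to the overload point."""
    amp = OVERLOAD * 10.0 ** (level_db / 20.0)
    n = int(round(duration_s * fs))
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / fs) for i in range(n)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_db(inp, out, interval_s=0.4, skip_intervals=0, fs=FS):
    """Average the per-interval gains; optionally skip the first intervals."""
    step = int(round(interval_s * fs))
    gains = [
        20.0 * math.log10(rms(out[k:k + step]) / rms(inp[k:k + step]))
        for k in range(0, len(inp) - step + 1, step)
    ]
    kept = gains[skip_intervals:]
    return sum(kept) / len(kept)


def frequency_response(codec, freqs_hz, level_db=-22.0, duration_s=2.0):
    """Gain in dB per test frequency for the given codec function."""
    response = {}
    for f in freqs_hz:
        tone = sine(f, level_db, duration_s)
        response[f] = gain_db(tone, codec(tone))
    return response


def half_gain(samples):
    """Hypothetical stand-in for the codec: attenuates by a factor of 2."""
    return [0.5 * s for s in samples]


# Three example frequencies; a full sweep would use range(80, 3601, 21).
curve = frequency_response(half_gain, [80, 1004, 3587])
print({f: round(g, 2) for f, g in curve.items()})
```

Setting `skip_intervals=1` drops the first 400 ms interval, which corresponds to disregarding the first samples when a transition effect is suspected.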
13.3 Results
Within the telephony band the frequency response is very flat. No abnormal deviations were observed. Additional experiments with different input levels (-18 dB, -28 dB) or different tone lengths (500 ms, 4 s) resulted in almost identical curves. The decrease in gain above 3 kHz is relatively small and stays well below 3 dB. The transition behaviour was very good.
Figure 13: GSM EFR codec frequency response at different input levels
The complexity of the Enhanced Full Rate is characterised by the following three items:
- the number of cycles;
- the data memory size;
- the program memory size.
The values of these different figures depend on a specific DSP implementation. Nevertheless, the results obtained by the C description analysis can be used as references.
The speech transcoding functions are specified using a set of basic arithmetic operations. The WMOPS figure quoted is a weighted sum of the operations required to perform transcoding. The weight assigned to each operation is representative of the number of instruction cycles required to perform that operation on a typical DSP device.
The complexity range of the Enhanced Full Rate is equivalent to the Half Rate codec complexity.
The number of cycles required by the Enhanced Full Rate algorithm is relatively independent of the values of the input samples. The execution times of an average and an extreme input case are equivalent.
Nevertheless the following table presents the theoretical worst case evaluation, i.e. the maximum possible number of cycles, which is consistent with the results indicated in [3].
The following figures apply to the speech and channel coding parts, excluding the DTX functions.
Table 14: Principal complexity figure
| | Theoretical | Data RAM (16-bit words) | Data ROM (constants) (16-bit words) | Program ROM (assembly instructions) |
| Enhanced Full Rate | 18,1 | 4 708 | 5 363 | 6 000 to 9 000 |
| Half Rate | 21,2 | 5 002 | 8 781 | 8 000 to 12 000 |
NOTE: The Data RAM figure can be split into 2 parts: the static variables (2 240 words) and the dynamic variables, i.e. local to a procedure (2 468 words).
The EFR codec is better than the existing FR codec for clear speech, for all error conditions (EP1, EP2 and EP3) and for tandeming under error condition EP1; it is equivalent to G.728 for its intrinsic quality, for background noise conditions and for talker dependency. The EFR codec does not reach the objective performance target (TCH-FS EP2) for the severe error condition EP3; for error conditions EP1 and EP2 it does not reach the objective performance target in half of the results. The EFR quality under tandeming without errors was not tested against the target G.728 but was found equivalent to G.721. The advantage of the EFR over the TCH-FS depends on the quality of the network: as channel errors increase, this advantage is reduced.
Table 15: Summary of Results
| Conditions | High Ref | Low Ref |
| EP0 | Equivalent to G.728 | Equivalent to G.728; better than TCH-FS |
| EP1 | Worse than MNRU 24 dB for half of labs | Better than TCH-FS (EP1) |
| EP2 | Worse than TCH-FS (EP1) for half of labs | Better than TCH-FS (EP2) |
| EP3 | Worse than TCH-FS (EP2) | Better than TCH-FS (EP3) except for one lab. |
| EP0 (tandem) | G.728 (not tested); equivalent to G.721 | G.728 (not tested); equivalent to G.721 |
| EP1 (tandem) | Equivalent to TCH-FS (EP1) | Better than TCH-FS (EP1 tandem) |
| Vehicle 10 | Equivalent to G.728 | Equivalent to G.728 |
| Music 20 | Equivalent to G.728 | Equivalent to G.728; better than TCH-FS |
| Male Talkers | Equivalent to G.728 | Equivalent to G.728 |
| Female Talkers | Equivalent to G.728 | Equivalent to G.728 |
| Children | Equivalent to G.728 | Equivalent to G.728 |
Figure 15: General trend of the EFR behaviour for error conditions in noise-free environment
Table A.1.1: Q values and Differential Q (dB) values from References for error and tandeming conditions (BT/lab1, Mod. IRS input characteristics - SEG-4, Exp#1 and Exp#5)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 | +3,71 | +3,71 | 29,86 | 26,15 | 26,15 |
| EP1 | -2,42 | +2,96 | 21,58 | 24 | 18,62 |
| EP2 | -2,97 | +0,96 | 15,65 | 18,62 | 14,69 |
| EP3 | -11,30 | -0,55 | 0,41 | 11,71 | 0,96 |
| EP0 (tandem) | - | - | - | 22,94 | 22,94 |
| EP1 (tandem) | -2,72 | +1,26 | 15,90 | 18,62 | 14,64 |
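In table A.1.1 each differential Q value equals the EFR Q value minus the Q value of the corresponding reference. The sketch below recomputes the lab1 differentials from the three Q columns; the function name and the data layout are illustrative only.

```python
def differential_q(q_efr, q_ref):
    """Differential Q in dB: EFR Q value minus reference Q value."""
    return round(q_efr - q_ref, 2)


# (Q EFR, Q High Ref., Q Low Ref.) per condition, from table A.1.1 (BT/lab1).
LAB1_Q = {
    "EP0": (29.86, 26.15, 26.15),
    "EP1": (21.58, 24.0, 18.62),
    "EP2": (15.65, 18.62, 14.69),
    "EP3": (0.41, 11.71, 0.96),
    "EP1 (tandem)": (15.90, 18.62, 14.64),
}

for condition, (q_efr, q_high, q_low) in LAB1_Q.items():
    print(condition, differential_q(q_efr, q_high), differential_q(q_efr, q_low))
```

A positive differential means the EFR scored above the reference for that condition, a negative one that it scored below.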
Table A.1.2: Q values and Differential Q (dB) values from References for error and tandeming conditions (CNET/lab2, Mod. IRS input characteristics - SEG-4, Exp#1 and Exp#5)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 | +12,59 | +12,59 | 39,06 | 26,47 | 26,47 |
| EP1 | 0 / -1,33 | +6,14 | 22,67 | 22,67 / 24 | 16,53 |
| EP2 | +0,15 | +2,32 | 16,68 | 16,53 | 14,36 |
| EP3 | -11,95 | +1,21 | 2,41 | 14,36 | 1,20 |
| EP0 (tandem) | - | - | - | 25,71 | 25,71 |
| EP1 (tandem) | +2,22 | +5,29 | 18,75 | 16,53 | 13,46 |
Table A.1.3: Q values and Differential Q (dB) values from References for error and tandeming conditions (TD/lab3, Mod. IRS input characteristics - SEG-4, Exp#1 and Exp#5)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 | +1,98 | +1,98 | 28,66 | 26,68 | 26,68 |
| EP1 | +2,74 / +2,60 | +7,06 | 26,60 | 23,86 / 24 | 19,54 |
| EP2 | -1,53 | +2,50 | 18,01 | 19,54 | 15,51 |
| EP3 | -15,33 | > +0,18 | 0,18 | 15,51 | < 0 |
| EP0 (tandem) | - | - | - | 23,66 | 23,66 |
| EP1 (tandem) | +0,76 | +6,06 | 20,30 | 19,54 | 14,24 |
Table A.1.4: Q values and Differential Q (dB) values from References for error and tandeming conditions (NEC/lab4, Mod. IRS input characteristics - SEG-4, Exp#1 and Exp#5)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 | +3,70 | +3,70 | 26,32 | 22,62 | 22,62 |
| EP1 | -1,50 | +5,50 | 22,50 | 24 | 17,00 |
| EP2 | +4,63 | +6,76 | 21,63 | 17,00 | 14,87 |
| EP3 | -10,49 | +2,70 | 4,38 | 14,87 | 1,68 |
| EP0 (tandem) | - | - | - | 19,32 | 19,32 |
| EP1 (tandem) | +2,92 | +8,49 | 19,92 | 17,00 | 11,43 |
Table A.1.5: Q values and Differential Q (dB) values from References for error and tandeming conditions (MOTOROLA/lab5, Mod. IRS input characteristics)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 | - | - | 24,82 | ? | - |
| EP1 | -4,41 | +3,79 | 19,59 | 24 | 15,80 |
| EP2 | -1,17 | +3,35 | 14,63 | 15,80 | 11,28 |
| EP3 | -7,23 | > +4,05 | 4,05 | 11,28 | < 0 |
| EP0 (tandem) | - | - | - | - | - |
| EP1 (tandem) | - | - | - | 15,80 | - |
Table A.1.6: Q values and Differential Q (dB) values from References for error and tandeming conditions (COMSAT/lab6)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 - (flat input) | +1,39 | +1,39 | 31,03 | 29,64 | 29,64 |
| EP1 (Mod. IRS) | ~ +2,79 | > +5,86 | > 25 | -(24) | 19,14 |
| EP2 (Mod. IRS) | +1,03 | +4,15 | 20,17 | 19,14 | 14,99 |
| EP3 | - | - | | 14,99 | - |
| EP0 (tandem) - (flat input) | (G.728) +2,35 (G.721) | (G.728) +2,35 (G.721) | 28,78 | (G.728) 26,43 (G.721) | (G.728) 26,43 (G.721) |
| EP1 (tandem) - (flat input) | - | - | - | 19,14 | - |
| Extra Conditions (not included in SEG-4, High and Low references not formally defined) | | | | G.721 (same condition) | TCH-FS (same condition) |
| EP0 -16 dBmOL - (flat input) | +2,31 (G.721) | +7,80 | 34,40 | 32,09 (G.721) | 27,32 |
| EP0 -36 dBmOL - (flat input) | -0,61 (G.721) | +2,41 | 25,08 | 25,69 (G.721) | 22,67 |
| C/I 10 dB, 1.5 mph (Mod. IRS) | | > +5,99 | > 25 | | 19,01 |
| C/I 13 dB (Mod. IRS) | | > +4,04 | > 25 | | 20,96 |
| C/I 13 dB tandem (Mod. IRS) | | > +9,80 | > 25 | | 15,20 |
| EP1 tandem EFR/TCH-FS - (flat) | | - | 24,46 | | - |
| EP1 tandem EFR/G.721 - (flat) | | +2,93 | 27,36 | | 24,43 |
Differences compared to the SEG-4: Different input characteristics (flat, except for error conditions), Additional input levels, tandemings and standards, G.721 as extra High Reference, Different MNRU selection, Separate experiment for error conditions (Non static, no frequency hopping 10 and 7 dB C/I, 30 mph, typical urban multipath, Mod. IRS input characteristics, MNRUmax = 25), No EP3 experiment.
Table A.1.7: Q values and Differential Q (dB) values from References for error and tandeming conditions (NOKIA/lab7)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 | > +2,12 | > +2,12 | > 30 | 27,88 | 27,88 |
| EP1 | ~ -3 | +14,79 | 27,88 | - (MNRU25 31,97) | 13,09 |
| EP2 | +4,90 | +8,65 | 17,99 | 13,09 | 9,34 |
| EP3 | -7,49 | > +1,85 | 1,85 | 9,34 | < 0 |
| EP0 (tandem) | - | - | - | 21,85 | 21,85 |
| EP1 (tandem) | +5,63 | +7,99 | 18,72 | 13,09? | 10,73 |
| Extra conditions (not included in SEG-4) | | | | | |
| C/I 13 dB | - | > 14,91 | > 30 | - | 15,09 |
Table A.1.8: Q values and Differential Q (dB) values from References for error and tandeming conditions (TI/lab8, Mod. IRS input characteristics - SEG-4, Exp#1 and Exp#5)
| Conditions | Differential Q Values (High Ref) | Differential Q Values (Low Ref) | Q Values EFR | Q Values High Ref. | Q Values Low Ref. |
| EP0 | +2,36 | +2,36 | 20,41 | 18,05 | 18,05 |
| EP1 | -5,21 | +5,15 | 18,79 | 24 | 13,64 |
| EP2 | -0,48 | +2,60 | 13,16 | 13,64 | 10,56 |
| EP3 | - | - | - | 10,56 | - |
| EP0 (tandem) | - | - | - | 17,18 | 17,18 |
| EP1 (tandem) | +1,03 | +5,16 | 14,67 | 13,64 | 9,51 |
Table A.2.1: DMOS (and CI) values for EFR codec, G.728 Reference and TCH-FS (for lab1 to lab4, flat input characteristics - SEG-4, Exp#2 and Exp#3)
| Conditions | Lab1 BT | Lab2 CNET | Lab3 TD | Lab4 NEC |
| EFR Vehicle 10 | 4,36 (0,17) | 4,49 (0,12) | 4,26 (0,16) | 4,44 (0,18) |
| EFR Music 20 | 4,29 (0,15) | 4,55 (0,11) | 4,20 (0,14) | 4,48 (0,18) |
| G.728 Vehicle 10 | 4,54 (0,15) | 4,47 (0,14) | 4,59 (0,13) | 4,48 (0,14) |
| G.728 Music 20 | 4,46 (0,13) | 4,52 (0,17) | 4,24 (0,11) | 4,52 (0,16) |
| TCH-FS Vehicle 10 | 4,20 (0,17) | 4,50 (0,11) | 4,16 (0,16) | 4,06 (0,19) |
| TCH-FS Music 20 | 3,36 (0,15) | 3,47 (0,15) | 3,11 (0,15) | 3,31 (0,20) |
Table A.2.2: DMOS (and CI) values for EFR codec, G.728 Reference and extra Standards (for lab5 to lab8, flat input characteristics)
| Conditions | Lab6/Comsat (1) (2) | Lab7/Nokia (1) |
| EFR Vehicle 10 | - | 4,47 (0,12) |
| EFR Music 20 | - | 4,57 (0,10) |
| G.728 Vehicle 10 | - | 4,45 (0,12) |
| G.728 Music 20 | - | 4,46 (0,11) |
| TCH-FS Vehicle 10 | - | 3,75 (0,15) |
| TCH-FS Music 20 | - | 3,54 (0,17) |
| Extra Conditions (not included in SEG-4) | | |
| EFR Home 20 dB | 4,79 (0,08) | - |
| EFR Vehicle 15 dB | 4,61 (0,10) | - |
| EFR Vehicle 25 dB | 4,65 (0,09) | - |
| EFR Street 10 dB | 4,41 (0,13) | - |
| EFR Office 20 dB | 4,66 (0,10) | - |
| TCH-FS Home 20 dB | 4,35 (0,12) | - |
| TCH-FS Vehicle 15 dB | 4,06 (0,13) | - |
| TCH-FS Vehicle 25 dB | 4,15 (0,14) | - |
| TCH-FS Street 10 dB | 3,54 (0,18) | - |
| TCH-FS Office 20 dB | 3,86 (0,15) | - |
| G.721 Home 20 dB | 4,67 (0,11) | - |
| G.721 Vehicle 15 dB | 4,56 (0,11) | - |
| G.721 Vehicle 25 dB | 4,65 (0,10) | - |
| G.721 Street 10 dB | 3,90 (0,17) | - |
| G.721 Office 20 dB | 4,49 (0,12) | - |
Differences compared to SEG-4:
1) Different selection of MNRUs with noise added for Lab6 and Lab7.
2) Different noise types, G.721 as High Reference, Additional standards for Lab6.
Table A.3.1: DMOS (and SD) for EFR codec and G.728 for talker dependency (lab1 to lab4, flat - SEG-4, Exp#4)
| Conditions | Lab1 BT | Lab2 CNET | Lab3 TD | Lab4 NEC |
| EFR Male Talkers | 4,89 (0,38) | 4,70 (0,46) | 4,77 (0,45) | 4,41 (0,73) |
| EFR Female Talkers | 4,91 (0,29) | 4,65 (0,56) | 4,81 (0,47) | 4,49 (0,65) |
| EFR Children | 4,82 (0,39) | 4,65 (0,53) | 4,83 (0,43) | 4,48 (0,71) |
| G.728 Male Talkers | 4,56 (0,59) | 4,32 (0,57) | 4,34 (0,61) | 4,36 (0,74) |
| G.728 Female Talkers | 4,61 (0,59) | 4,41 (0,55) | 4,36 (0,56) | 4,35 (0,74) |
| G.728 Children | 4,80 (0,46) | 4,40 (0,52) | 4,38 (0,57) | 4,50 (0,71) |
Table A.3.2: DMOS (and SD) for EFR codec and G.728 for talker dependency (lab7, flat)
| Conditions | EFR | G.728 |
| Male Talkers | 4,73 (0,51) | 4,49 (0,57) |
| Female Talkers | 4,64 (0,50) | 4,43 (0,56) |
| Children | 4,62 (0,59) | 4,37 (0,58) |
Differences compared to SEG-4: Different selection of MNRUs, extra condition (TCH-FS), 16 listeners instead of 24.
Change history
| SMG No. | Tdoc. No. | CR. No. | Section affected | New version | Subject/Comments |
| SMG#19 | | | | 5.0.0 | Phase 2+ version |
| SMG#22 | | | | 4.0.0 | Phase 2 version |
| SMG#27 | | | | 6.0.0 | Release 1997 version |
| SMG#29 | | | | 7.0.0 | Release 1998 version |
| SMG#31 | | | | 8.0.0 | Release 1999 version |
Change history
| Date | TSG # | TSG Doc. | CR | Rev | Subject/Comment | Old | New |
| 03-2001 | 11 | | | | Version for Release 4 | | 4.0.0 |
| 06-2002 | 16 | | | | Version for Release 5 | 4.0.0 | 5.0.0 |
| 12-2004 | 26 | | | | Version for Release 6 | 5.0.0 | 6.0.0 |
| 06-2007 | 36 | | | | Version for Release 7 | 6.0.0 | 7.0.0 |
| 12-2008 | 42 | | | | Version for Release 8 | 7.0.0 | 8.0.0 |