ISAR Pdoc on Testing Aspects for Phase/Track 2/a
Specification: 26996
Summary
This document describes test plan aspects for the selection testing of IVAS specific ISAR solutions targeted in Phase/Track 2/a of the ISAR work.
Status: approved by TSG and under formal change control.
Source: Rapporteur[1]
Title: ISAR Pdoc on Testing Aspects for Phase/Track 2/a
Version: v1.0.0
Document for: Agreement
Agenda Item: 7.7, 14.3
Document History:
v0.1.0 | 2 February 2024 | Initial version (high-level) |
v0.1.1 | 1 March 2024 | First detailed version for presentation at Audio SWG call, 1 March 2024 |
v0.1.2 | 1 March 2024 | Edits done by Audio SWG, agreed as basis for further work |
| 11 March 2024 | Edits related to only a single solution proponent remaining. |
v0.1.4 | 18 March 2024 | Addition of Annexes D and E. |
- Introduction
This Permanent Document describes test plan aspects for the selection testing of IVAS specific ISAR solutions targeted in Phase/Track 2/a of the ISAR work [1]. It covers the organization of the selection tests and the relevant processing and test plan aspects.
- References, Conventions, and Contacts
- Related ISAR Documents
The following documents provide additional information on the selection of IVAS specific ISAR solutions targeted in Phase/Track 2/a.
Document | Title |
TR 26.865 | Immersive Audio for Split Rendering Scenarios; Requirements |
S4-240403 | Work Plan for ISAR |
This document | Pdoc: Testing Aspects for Phase/Track 2/a |
S4-240398 | Pdoc: Selection Deliverables for Phase/Track 2/a |
S4-240397 | Pdoc: Selection Rules for IVAS Specific ISAR Solutions |
[1] Tdoc S4-240403: Work Plan for the ISAR v0.5.0
[2] Tdoc S4-240254: Trajectory Nullification for Binaural Renderer Evaluation, Fraunhofer IIS
[3] Tdoc S4aA230086: IVAS Permanent Document IVAS-8a: Test Plan for Selection Phase, v.1.1.0
[4] Recommendation ITU-R BS.1534 (10/2015): Method for the subjective assessment of intermediate quality level of audio systems.
[5] 3GPP TR 26.865: Immersive Audio for Split Rendering Scenarios; Requirements
[6] Recommendation ITU-R BS.1770-4 (10/2015): Algorithms to measure audio programme loudness and true-peak audio level.
[7] Tdoc S4aA230087: IVAS Permanent Document IVAS-7a: Processing plan for selection phase, v.1.1.0
[8] 3GPP TS 26.258: Codec for Immersive Voice and Audio Services; C code (floating-point)
[9] Audio File Format Specifications: WAVE, https://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html.
- Key Acronyms
CLL Cross-check Listening Laboratory
CuT Codec under Test
FB Full Band
HOA3 Higher-Order Ambisonics, 3rd order
IVAS Immersive Voice and Audio Services
ISM Independent Streams with Metadata
LL Listening Laboratory
MASA Metadata-Assisted Spatial Audio
MUSHRA Multi Stimulus test with Hidden Reference and Anchor
PCO Processing Cross-check Organization
SP Solution Proponent
SPL Solution Proponent Laboratory
TC Transport Channel
The selection tests of the IVAS specific ISAR solution will be organized as in-house tests by Solution Proponent Laboratories (SPLs). The execution of these subjective tests is the responsibility of the solution proponents participating in the selection and of other volunteering organizations.
The selection test experiments will be duplicated and additionally run by Cross-check Listening Laboratories (CLLs), i.e., suitable cross-checkers with no stake in the candidate solutions under test.
The testing organizations will report the selection test results to SA4 together with a suggested statistical result analysis. SA4 will review these analyses and, if it finds them valid, confirm them and base its selection decision on them.
The processing of the selection test material is under the responsibility of the solution proponents participating in the selection. It is based on commonly available processing scripts, solution candidate executables, original sound material and head-tracker trajectories available to the solution proponents and other organizations who volunteer to carry out cross-checks.
The specific responsibilities of the SP are:
- Make executables of solution candidate publicly available.
- Develop common processing scripts using the condition lists defined in this document and the processing steps defined in the processing plan.
- Process the test material using commonly available processing scripts, the shared solution candidate executables, original sound material and head-tracker trajectories.
- Communicate with Volunteering Processing Cross-check Organizations to verify correct processing.
- Solution Proponents Laboratory (SPL)
- Carry out selection tests according to the requirements of this test plan.
- Carry out statistical result analysis according to the requirements of this test plan and provide test report including analysis to SA4.
- Obligations as SPL:
- The testing shall be carried out in a blinded fashion, not revealing the conditions to the subjects.
- No test subjects shall be used who were actively involved in developing the split rendering features in the systems under test that are exposed by the experiments.
- The test report shall describe how the listening lab ensured unbiased testing.
- Volunteering Cross-check Listening Laboratories (CLL)
- Carry out selection tests according to the requirements of this test plan.
- Carry out statistical result analysis according to the requirements of this test plan and provide test report including analysis to SA4.
- Obligations as CLL:
- The testing shall be carried out in a blinded fashion, not revealing the conditions to the subjects.
- The CLL shall not be a contributor of the split rendering features in the systems under test that are exposed by the experiments.
- The test report shall contain a statement confirming that the listening lab has met the obligations.
- Listening Laboratories (LL) (both SPL and CLL)
- Provide a listening environment meeting the listening conditions for BS.1534 testing [4].
- Volunteering Processing Cross-check Organizations (PCO)
- Process the test material using commonly available processing scripts, the shared solution candidate executables, original sound material and head-tracker trajectories.
- Communicate with Solution Proponent to verify correct processing.
- SA4
- Review selection test analyses received from LLs and determine their validity.
- Selection of Candidate Solution according to selection rules for IVAS specific ISAR solutions targeted in Phase/Track 2/a of the ISAR work [1].
The statistical result analysis reports shall present the results of the Terms of Reference (ToR) tests using Student’s Dependent Groups t-test (single-sided at 95% confidence level). The results of the Requirement ToR tests for each experiment shall be presented with all data needed to verify proper execution of the Student’s Dependent Groups t-test.
For the Requirement ToR tests, this should lead to the following indications:
- Requirement ToR tests that are passed, (i.e., CuT “not worse than” Requirement) are indicated by CuT NWT Ref.
- Requirement ToR tests that are exceeded, (i.e., CuT “better than” Requirement) are indicated by CuT BT Ref.
- Requirement ToR tests that are failed (i.e., CuT “worse than” Requirement) are indicated by CuT WT Ref.
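The three indications above can be illustrated with a minimal sketch of a paired (dependent groups) t-test. The function names, the sign convention (CuT score minus Ref score) and the requirement that the caller supplies the one-sided 95% critical value for the applicable degrees of freedom are assumptions of this sketch; it is not the analysis tool used by the laboratories.

```python
import math
from statistics import mean, stdev


def paired_t(cut_scores, ref_scores):
    """Return (t statistic, degrees of freedom) for paired scores, CuT minus Ref."""
    if len(cut_scores) != len(ref_scores) or len(cut_scores) < 2:
        raise ValueError("need two equally long score lists with at least 2 pairs")
    diffs = [c - r for c, r in zip(cut_scores, ref_scores)]
    n = len(diffs)
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample standard deviation of the paired differences
    if d_sd == 0.0:
        # All differences are identical: t is 0 or unbounded in the sign of the mean.
        if d_mean == 0:
            return 0.0, n - 1
        return math.copysign(math.inf, d_mean), n - 1
    return d_mean / (d_sd / math.sqrt(n)), n - 1


def classify_tor(cut_scores, ref_scores, t_crit):
    """Map the paired t statistic to the indications used in the reports.

    t_crit is the one-sided 95% critical value for the applicable degrees of
    freedom, taken from a t table by the caller (assumption of this sketch).
    """
    t, _dof = paired_t(cut_scores, ref_scores)
    if t < -t_crit:
        return "CuT WT Ref"   # significantly worse than the requirement
    if t > t_crit:
        return "CuT BT Ref"   # significantly better than the requirement
    return "CuT NWT Ref"      # not worse than the requirement
```

For example, with 10 paired votes the degrees of freedom are 9, for which standard t tables give a one-sided 95% critical value of about 1.833.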
In the following, the identities of the involved organizations are listed:
- SP (proponent of CuT):
- Proponent companies: Dolby Sweden AB, Ericsson LM, Fraunhofer IIS, Nokia Corporation, NTT, Orange, Panasonic Holdings Corporation, Philips International B.V., Qualcomm Incorporated, VoiceAge Corporation
- Main contributor(s) to CuT whose split rendering features are exposed in the experiments: Dolby Sweden AB, Fraunhofer IIS
- SPLs:
- Dolby Sweden AB, Fraunhofer IIS
- CLLs:
- Qualcomm, Nokia, Bytedance, Ittiam
- Processing Cross-check Organizations (PCO)
- Fraunhofer IIS, Dolby
- LL assignment
Experiment | SPL | CLL/other SPL |
BS1534-1: SBA (HOA3) | Dolby | Qualcomm |
BS1534-2: Multi-channel 7.1+4 | Fraunhofer IIS | Ittiam |
BS1534-3: Objects | Fraunhofer IIS | Nokia |
BS1534-4: MASA | Dolby | Bytedance |
Any and all deviations from the specifications contained in this document must be documented and submitted to SA4 along with the test reports.
- General Consideration of Experiments
- Difference scenario between assumed and actual end-device poses
For the evaluation of ISAR split rendering solutions, a primary focus should be testing with relevant difference scenarios between assumed and actual end-device poses. To cover relevant cases, the head-tracker trajectory files used should be taken from the following categories:
- Static within range: ±20 degrees
- Dynamic within range: ±20 degrees
- Sinusoidal: 0.25 Hz
- Triangular: 0.5 Hz
- Real, i.e., derived from real head tracker trajectories with movements giving rise to substantial differences (>15 degrees) between assumed and actual end-device poses and exposing the tested methods to a sufficient degree.
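The synthetic dynamic categories above can be sketched as yaw-angle generators. This is an illustration only: the function names, the 50 Hz update rate and the choice of a zero-phase start are assumptions, and the sketch produces plain angle lists rather than files in the trajectory format required by the IVAS specifications.

```python
import math


def sinusoidal_yaw(t_s, amplitude_deg=20.0, freq_hz=0.25):
    """Yaw in degrees of a sinusoidal movement at time t_s (seconds)."""
    return amplitude_deg * math.sin(2.0 * math.pi * freq_hz * t_s)


def triangular_yaw(t_s, amplitude_deg=20.0, freq_hz=0.5):
    """Yaw in degrees of a triangular movement at time t_s (seconds).

    asin(sin(x)) is a triangle wave with peak value pi/2, hence the 2/pi scaling.
    """
    phase = math.sin(2.0 * math.pi * freq_hz * t_s)
    return amplitude_deg * (2.0 / math.pi) * math.asin(phase)


def sample_trajectory(yaw_fn, duration_s, update_rate_hz=50.0):
    """Sample a yaw function at a fixed (hypothetical) pose update rate."""
    n = int(round(duration_s * update_rate_hz))
    return [yaw_fn(i / update_rate_hz) for i in range(n)]
```

Both generators stay within the range of ±20 degrees by construction, which makes the range requirement easy to check on the sampled values.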
- DOF
Another aspect of interest is the ability to deal with differences of assumed and actual end-device poses around different axes, i.e., the number of degrees of freedom (1-3) which the candidate solutions can cope with. Accordingly, the head-tracker trajectory files shall cover the following DOF scenarios:
- 1-DOF with pose deviations in yaw
- 2-DOF with pose deviations in yaw and pitch
- 3-DOF with pose deviations in yaw, pitch and roll
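As a rough illustration of the DOF scenarios, the sketch below lists the axes on which a pair of pre- and post-renderer trajectories deviates. The function name, the representation of poses as (yaw, pitch, roll) tuples in degrees and the per-axis difference as a measure of pose deviation are assumptions made for this example.

```python
def deviating_axes(pre_poses, post_poses, threshold_deg=0.0):
    """Return the names of the axes on which pre and post poses deviate.

    pre_poses and post_poses are equally long, non-empty lists of
    (yaw, pitch, roll) tuples in degrees (hypothetical representation).
    """
    names = ("yaw", "pitch", "roll")
    active = []
    for axis, name in enumerate(names):
        max_dev = max(abs(p[axis] - q[axis]) for p, q in zip(pre_poses, post_poses))
        if max_dev > threshold_deg:
            active.append(name)
    return active
```

The length of the returned list corresponds to the DOF scenario (1 to 3) that a trajectory pair exercises.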
- Rendering simulation
Two different rendering simulation methodologies shall be covered in the tests.
- Trajectory nullification [2]
This simulation methodology is based on the concept that an immersive audio scene is pre-rotated prior to IVAS encoding while the head-tracked rendering compensates for the pre-rotation. In the ideal case and under certain conditions, this compensation can be perfect. This simulation methodology exposes the ability of a split rendering system to compensate for the pre-rotation despite the differences between assumed and actual end-device poses.
- Unguided end-device pose
This simulation methodology is based on rendering a decoded immersive audio scene according to a given head-tracker trajectory. It exposes the ability of a split rendering system to follow the actual head-tracker trajectory even though only the divergent assumed head-tracker trajectory is available at the pre-renderer.
- Input formats
According to the ISAR requirements, the tests shall cover the following IVAS codec input formats:
- SBA (HOA3)
- Multi-channel 7.1+4
- Objects (ISM-4)
- MASA (2 TC)
BS.1534 test methodologies shall be used in the ISAR selection tests. High-level configuration of the experiments is outlined below.
- BS.1534
- Number of items per experiment: 12
- 10 experienced listeners
- Total number of conditions: 4
- Number of anchor conditions: 2
- Native reference system
- 7 kHz low-pass anchor
- Head-tracker trajectories
All head-tracker trajectory files shall follow the convention imposed by the IVAS specifications [8], i.e., shall be usable by the IVAS decoder/renderer.
- Head-tracker trajectory categories and DOF
Head-tracker trajectories shall meet the above-defined category and DOF specifications.
- Head-tracker trajectory availability and selection
Head-tracker trajectories of the above-defined categories and DOF will be publicly collected from volunteering organizations. After collection and checking suitability, a list of available trajectories will be generated. The collection and checking will be done jointly by the involved organizations of the ISAR selection, i.e., the SP, the LLs and the PCOs.
In a second step, these organizations will jointly select up to 6 suitable trajectories for each test. In case this number of trajectories is not available, a smaller number is selected where a given trajectory may be reused across different tests or within a test.
The involved organizations will document their trajectory selection and assignment to tests for inclusion of this information into this document.
All audio material shall be sampled at 48 kHz with Full Band (FB) content and formatted as 16-bit little endian WAVE format files [9].
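A basic check of the sampling rate and sample width can be sketched with the Python standard library. The function name and the returned problem strings are assumptions of this example. The sketch does not verify Full Band content, and multi-channel files that use the extensible WAVE header may be rejected by older Python versions of the `wave` module.

```python
import wave


def check_wave_format(path, expected_rate_hz=48000, expected_sample_bytes=2):
    """Return a list of deviations from 48 kHz / 16-bit PCM (empty if none)."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != expected_rate_hz:
            problems.append(f"sampling rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != expected_sample_bytes:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bit")
    return problems
```

An empty result means that the two checked properties match the requirement; WAVE (RIFF) data is little endian by definition of the format.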
- Audio categories
To cover a broad range of conceivable audio categories, the test items should be taken from the categories clean and noisy speech, music, and critical audio. However, the ability of the system to deal with different audio categories is only a secondary focus of the selection tests, and full coverage of these categories may not be possible.
- Test Item availability and selection
Audio material of the above categories has been collected as part of the IVAS codec selection phase. Details of this material are available in the IVAS test plan [3]. A subset of this material is either publicly available or at least available to the involved organizations of the ISAR selection, i.e., the SP, the LLs and the PCOs. These organizations will in a first step create a list of commonly available test items.
In a second step, these organizations will jointly select up to 12 suitable original test items for each test. In case this number of test items is not available, a smaller number is selected where a given test item may be reused across different tests or even within a test if the applied head-tracker trajectories, DOF or simulation methodology is different.
The involved organizations will document their test item selection and assignment to tests for inclusion of this information into this document.
No dedicated training material will be made available for use in a potential training phase in which the subjects may familiarize themselves with the testing methodology and environment.
Such a training phase is voluntary and is the responsibility of the involved LLs. No items from the main tests shall be used for training. A training phase shall be executed as a separate short BS.1534 session.
The ISAR Selection Test will use the following listening systems:
- High-quality stereo headphones for binaural listening, e.g.:
- Sennheiser HD 650
The purpose of the 4 experiments (Experiments BS1534-1 to BS1534-4) is to evaluate the performance of the IVAS specific ISAR solution candidate with respect to the performance requirements and objectives defined in ISAR TR 26.865 [5].
The details provided in this section and in corresponding Annexes are those that are specific to each particular experiment. Generic information can be found in Section 4. Therefore, the LLs should use the information in Section 4 in conjunction with the information given in this section and Annexes.
Table 4 shows a high-level overview of the experiments.
Detailed conditions for each subjective experiment are defined in Annex B.
Table 4: High-level overview of ISAR selection experiments
Exp | Input format | Source material | Listening environment | Bitrates [kbps] |
BS1534-1 | SBA (HOA3) | Generic Audio | Headphones | IVAS: 512, CuT: 768 |
BS1534-2 | Multi-channel 7.1+4 | Generic Audio | Headphones | IVAS: 512, CuT: 768 |
BS1534-3 | Objects (ISM-4) | Generic Audio | Headphones | IVAS: 512, CuT: 768 |
BS1534-4 | MASA (2 TC) | Generic Audio | Headphones | IVAS: 512, CuT: 768 |
- Selection Testing Timeline
Table A.1: Testing Timeline
Month | Meeting/date | Task | Active Parties |
March-2024 | March 1 | 3GPP SA4 Audio SWG telco | Audio SWG |
March-2024 | March 8 | Collection, checking and selection of head-tracker trajectories and assignment to tests completed | SP, LLs, PCOs |
March-2024 | March 8 | Identification and selection of test items and assignment to tests completed | SP, LLs, PCOs |
March-2024 | March 11 | 3GPP SA4 Audio SWG telco; provision of final draft of processing scripts incl. command lines; confirm LL assignment | SP, PCOs |
March-2024 | March 15 | Selection test processing completed including cross-checking | SP, LLs, PCOs |
March-2024 | March 18 | 3GPP SA4 Audio SWG telco | Audio SWG |
March-2024 | March 19 | Start of listening tests | LLs |
April-2024 | April 2 | Delivery of LL reports | LLs |
April-2024 | April 8-12 | 3GPP SA4 meeting #127bis-e – IVAS specific ISAR solution Selection meeting | SA4 |
- Experiment BS1534-1: SBA (HOA3)
Table B.1.1: Conditions (BS1534-1 Generic Audio)
Main Conditions | |
Condition under Test (CuT) | IVAS operated with HOA3 audio input at 512 kbps rendered to pre-renderer pose, pose corrected to post-renderer pose with ISAR candidate operated at 768 kbps |
References (Hidden and open) | |
Reference | (Native coding system) IVAS operated with HOA3 audio input at 512 kbps rendered to post-renderer pose |
Other references | |
LP7 anchor | 7 kHz lowpass filtered reference, nominal level |
0-DOF native transcoding reference | IVAS operated with HOA3 audio input at 512 kbps rendered to pre-renderer pose, binaural output transcoded with IVAS stereo coded at 256 kbps |
Common Conditions | |
Test item generation | According to material collection procedure for ISAR selection BS.1534 tests. |
Audio sampling frequency/bandwidth | 48 kHz/FB |
Input frequency mask | 20KBP |
Nominal output loudness | -26 LKFS [6] |
Listening Level | Adjusted by listener |
Listeners | Experienced Listeners |
Randomizations | Individual per listener |
Rating Scale | Continuous BS.1534 scale from 0-100 [4] |
Listening System | High-quality headphone for diotic presentation, in accordance with clause 4.6 |
Listening Environment | No room noise |
Table B.1.2: Test conditions for Experiment BS1534-1
Label | Condition | Bitrate [kbps] | ToR |
c01 | Reference | 512 | - |
c02 | LP7 anchor | 512 | - |
c03 | 0-DOF | 512, 256 | - |
c04 | CuT1 | 512, 768 | NWT c03 |
- Experiment BS1534-2: Multi-channel 7.1+4
Table B.2.1: Conditions (BS1534-2 Generic Audio)
Main Conditions | |
Condition under Test (CuT) | IVAS operated with 7.1+4 audio input at 512 kbps rendered to pre-renderer pose, pose corrected to post-renderer pose with ISAR candidate operated at 768 kbps |
References (Hidden and open) | |
Reference | (Native coding system) IVAS operated with 7.1+4 audio input at 512 kbps rendered to post-renderer pose |
Other references | |
LP7 anchor | 7 kHz lowpass filtered reference, nominal level |
0-DOF native transcoding reference | IVAS operated with 7.1+4 audio input at 512 kbps rendered to pre-renderer pose, binaural output transcoded with IVAS stereo coded at 256 kbps |
Common Conditions | |
Test item generation | According to material collection procedure for ISAR selection BS.1534 tests. |
Audio sampling frequency/bandwidth | 48 kHz/FB |
Input frequency mask | 20KBP |
Nominal output loudness | -26 LKFS [6] |
Listening Level | Adjusted by listener |
Listeners | Experienced Listeners |
Randomizations | Individual per listener |
Rating Scale | Continuous BS.1534 scale from 0-100 [4] |
Listening System | High-quality headphone for diotic presentation, in accordance with clause 4.6 |
Listening Environment | No room noise |
Table B.2.2: Test conditions for Experiment BS1534-2
Label | Condition | Bitrate [kbps] | ToR |
c01 | Reference | 512 | - |
c02 | LP7 anchor | 512 | - |
c03 | 0-DOF | 512, 256 | - |
c04 | CuT1 | 512, 768 | NWT c03 |
- Experiment BS1534-3: Objects
Test with 4 Objects (ISM-4) in BS1534-3 at high bitrate.
Table B.3.1: Conditions (BS1534-3 Generic Audio)
Main Conditions | |
Condition under Test (CuT) | IVAS operated with 4 Objects audio input at 512 kbps rendered to pre-renderer pose, pose corrected to post-renderer pose with ISAR candidate operated at 768 kbps |
References (Hidden and open) | |
Reference | (Native coding system) IVAS operated with 4 Objects audio input at 512 kbps rendered to post-renderer pose |
Other references | |
LP7 anchor | 7 kHz lowpass filtered reference, nominal level |
0-DOF native transcoding reference | IVAS operated with 4 Objects audio input at 512 kbps rendered to pre-renderer pose, binaural output transcoded with IVAS stereo coded at 256 kbps |
Common Conditions | |
Test item generation | According to material collection procedure for ISAR selection BS.1534 tests. |
Audio sampling frequency/bandwidth | 48 kHz/FB |
Input frequency mask | 20KBP |
Nominal output loudness | -26 LKFS [6] |
Listening Level | Adjusted by listener |
Listeners | Experienced Listeners |
Randomizations | Individual per listener |
Rating Scale | Continuous BS.1534 scale from 0-100 [4] |
Listening System | High-quality headphone for diotic presentation, in accordance with clause 4.6 |
Listening Environment | No room noise |
Table B.3.2: Test conditions for Experiment BS1534-3
Label | Condition | Bitrate [kbps] | ToR |
c01 | Reference | 512 | - |
c02 | LP7 anchor | 512 | - |
c03 | 0-DOF | 512, 256 | - |
c04 | CuT1 | 512, 768 | NWT c03 |
- Experiment BS1534-4: MASA
Table B.4.1: Conditions (BS1534-4 Generic Audio)
Main Conditions | |
Condition under Test (CuT1) | IVAS operated with MASA (2 TC) audio input at 512 kbps rendered to pre-renderer pose, pose corrected to post-renderer pose with ISAR candidate operated at 768 kbps |
References (Hidden and open) | |
Reference | (Native coding system) IVAS operated with MASA (2 TC) audio input at 512 kbps rendered to post-renderer pose |
Other references | |
LP7 anchor | 7 kHz lowpass filtered reference, nominal level |
0-DOF native transcoding reference | IVAS operated with MASA (2 TC) audio input at 512 kbps rendered to pre-renderer pose, binaural output transcoded with IVAS stereo coded at 256 kbps |
Common Conditions | |
Test item generation | According to material collection procedure for ISAR selection BS.1534 tests. |
Audio sampling frequency/bandwidth | 48 kHz/FB |
Input frequency mask | 20KBP |
Nominal output loudness | -26 LKFS [6] |
Listening Level | Adjusted by listener |
Listeners | Experienced Listeners |
Randomizations | Individual per listener |
Rating Scale | Continuous BS.1534 scale from 0-100 [4] |
Listening System | High-quality headphone for diotic presentation, in accordance with clause 4.6 |
Listening Environment | No room noise |
Table B.4.2: Test conditions for Experiment BS1534-4
Label | Condition | Bitrate [kbps] | ToR |
c01 | Reference | 512 | - |
c02 | LP7 anchor | 512 | - |
c03 | 0-DOF | 512, 256 | - |
c04 | CuT1 | 512, 768 | NWT c03 |
- Processing plan
The processing of the test material for the ISAR selection tests conceptually follows the IVAS selection processing [7]. In particular, the pre-processing and post-processing steps are identical to those described in sections 4.3 and 4.10 of [7], respectively.
The main processing is described below.
Note: Irrespective of this processing plan, the processing is defined by the publicly available processing scripts [10]. In case of deviations between this plan and the scripts, the implementation in the scripts prevails.
- Reference conditions
This condition is the IVAS codec operated with the audio input in the format of the given experiment, encoded at 512 kbps and rendered to post-renderer pose.
Assuming a head-tracker trajectory file at the lightweight device called ‘post.csv’, the respective command lines are:
- Experiment BS.1534-1: SBA
ivas_cod.exe -sba 3 512000 48 hoa3.wav enc_out.pkt
ivas_dec.exe -t post.csv BINAURAL 48 enc_out.pkt ref.wav
- Experiment BS.1534-2: Multi-channel 7.1+4
ivas_cod.exe -mc 7_1_4 512000 48 mc714.wav enc_out.pkt
ivas_dec.exe -t post.csv BINAURAL 48 enc_out.pkt ref.wav
- Experiment BS.1534-3: Objects
ivas_cod.exe -ism 4 ism_Obj1.csv ism_Obj2.csv ism_Obj3.csv ism_Obj4.csv 512000 48 ism4.wav enc_out.pkt
ivas_dec.exe -t post.csv BINAURAL 48 enc_out.pkt ref.wav
- Experiment BS.1534-4: MASA
ivas_cod.exe -masa 2 masa.metadata 512000 48 masa2ch.wav enc_out.pkt
ivas_dec.exe -t post.csv BINAURAL 48 enc_out.pkt ref.wav
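The reference command lines above differ only in the format arguments and the input file of the encoder. A minimal sketch of how they could be assembled per experiment is shown below; the table and function names are hypothetical, and the argument lists are copied from the command lines above rather than from the processing scripts.

```python
# Encoder format arguments and input file per experiment (from the command lines above).
ENCODER_FORMAT_ARGS = {
    "BS1534-1": (["-sba", "3"], "hoa3.wav"),
    "BS1534-2": (["-mc", "7_1_4"], "mc714.wav"),
    "BS1534-3": (["-ism", "4", "ism_Obj1.csv", "ism_Obj2.csv",
                  "ism_Obj3.csv", "ism_Obj4.csv"], "ism4.wav"),
    "BS1534-4": (["-masa", "2", "masa.metadata"], "masa2ch.wav"),
}


def reference_commands(experiment, trajectory="post.csv"):
    """Return [encoder command, decoder command] as argument lists."""
    format_args, input_wav = ENCODER_FORMAT_ARGS[experiment]
    encode = ["ivas_cod.exe", *format_args, "512000", "48", input_wav, "enc_out.pkt"]
    decode = ["ivas_dec.exe", "-t", trajectory, "BINAURAL", "48", "enc_out.pkt", "ref.wav"]
    return [encode, decode]
```

Argument lists of this form can be passed directly to `subprocess.run` without shell quoting issues.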
- 7 kHz anchor conditions
The 7 kHz anchors are obtained by 7 kHz lowpass filtering of the processed reference conditions before post processing.
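To illustrate what 7 kHz lowpass filtering means at a 48 kHz sampling rate, the sketch below designs a windowed-sinc FIR lowpass and evaluates its gain at selected frequencies. It is not the filter used by the processing scripts; the tap count, the Hamming window and the function names are assumptions chosen for this example.

```python
import cmath
import math


def design_lowpass(cutoff_hz, fs_hz, num_taps=101):
    """Design a Hamming-windowed sinc lowpass with unity gain at DC."""
    fc = cutoff_hz / fs_hz           # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal lowpass impulse response (sinc), with the x == 0 limit handled.
        ideal = 2.0 * fc if x == 0 else math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity DC gain


def gain_at(taps, freq_hz, fs_hz):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    return abs(sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps)))
```

With these assumed parameters, content well below 7 kHz passes essentially unchanged while content well above 7 kHz is strongly attenuated.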
- 0-DOF native transcoding reference conditions
As under C.1 with the difference that the IVAS decoder command lines are replaced by
ivas_dec.exe -t pre.csv BINAURAL 48 enc_out.pkt out_0dof.wav
ivas_cod.exe -stereo 256000 48 out_0dof.wav stereo_0dof.pkt
ivas_dec.exe STEREO 48 stereo_0dof.pkt native_ref_0dof.wav
Note: the used head-tracker trajectory file ‘pre.csv’ is the assumed head-tracker pose at the pre-renderer.
- CuT conditions
As under C.1 with the difference that the IVAS decoder command lines are replaced by
IVAS decoder (in split pre-rendering mode):
ivas_dec.exe -render_config split_renderer_config_768_3dof_cldfbpc.txt -t pre.csv BINAURAL_SPLIT_CODED 48 enc_out.pkt split_out.pkt
IVAS ext renderer (in split post-rendering mode):
ivas_rend.exe -i split_out.pkt -if BINAURAL_SPLIT_CODED -o out.wav -of BINAURAL -fs 48 -t post.csv
Note: The two SPs are expected to coordinate about how to use these command lines and may agree on modifications.
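The decoder-side steps of the 0-DOF and CuT conditions can be summarized in a small sketch that returns the command lines as argument lists and reports which trajectory files each chain uses. The function names and condition labels are hypothetical; the arguments are copied from the command lines above and may differ from the final processing scripts.

```python
def decoder_stage(condition, render_config="split_renderer_config_768_3dof_cldfbpc.txt"):
    """Return the decoder-side command lines for the '0dof' or 'cut' condition."""
    if condition == "0dof":
        return [
            ["ivas_dec.exe", "-t", "pre.csv", "BINAURAL", "48", "enc_out.pkt", "out_0dof.wav"],
            ["ivas_cod.exe", "-stereo", "256000", "48", "out_0dof.wav", "stereo_0dof.pkt"],
            ["ivas_dec.exe", "STEREO", "48", "stereo_0dof.pkt", "native_ref_0dof.wav"],
        ]
    if condition == "cut":
        return [
            ["ivas_dec.exe", "-render_config", render_config, "-t", "pre.csv",
             "BINAURAL_SPLIT_CODED", "48", "enc_out.pkt", "split_out.pkt"],
            ["ivas_rend.exe", "-i", "split_out.pkt", "-if", "BINAURAL_SPLIT_CODED",
             "-o", "out.wav", "-of", "BINAURAL", "-fs", "48", "-t", "post.csv"],
        ]
    raise ValueError(f"unknown condition: {condition}")


def trajectories_used(commands):
    """Collect the file names that follow each '-t' option, in order."""
    used = []
    for cmd in commands:
        for i, arg in enumerate(cmd[:-1]):
            if arg == "-t":
                used.append(cmd[i + 1])
    return used
```

The sketch makes the difference between the two conditions visible: the 0-DOF chain only sees the assumed pose (pre.csv), whereas the CuT chain pre-renders with pre.csv and corrects to post.csv.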
- Report on ISAR Item/Trajectory Collection, Selection and Verification
- Introduction
According to this ISAR Pdoc on Testing Aspects for Phase/Track 2/a, item and trajectory collection should be completed by March 8, 2024, and followed by selection and verification.
Based on the collected material, the following selection and verification actions were carried out.
- Selection guiding principles:
- Item selection:
- Include as many suitable items as possible.
- Exclusion criteria:
- Defective items, e.g., clipped items
- Items that are too diffuse or too chaotic, such that the effect of head-tracking is hardly perceivable.
- Trajectory selection:
- Include as many trajectories as possible.
- Exclusion criteria:
- Non-compliance with test plan requirements, e.g., pose deviation (between pre- and post-renderer) exceeding the range of ±20 degrees in the case of synthetic trajectories
- Combined item and trajectory selection
- Item and trajectory in combination should expose substantial pose deviations (between pre- and post-renderer) during audio activity.
- Collected items and trajectories:
Sound items (received by deadline)
- SBA
- Dolby – 5 items
- MC
- Dolby – 5 items
- Nokia – 3 items
- ISM
- Dolby – 5 items
- Ericsson – 4 items
- FhG – 4 items
- Nokia – 3 items
- MASA
- Same as ISM whereby the ISM items are converted to HOA2 and used as an input to MASA analyzer.
Sound items (received after deadline):
- ISM
- Bytedance – 2
Trajectories (received by deadline):
- Static
- Dolby – 2
- Sinusoidal
- Dolby – 1
- Triangular
- Dolby – 3
- Nokia – 8
- Real
- FhG – 3
Trajectories (received after deadline):
- Bytedance - 2
- Selection proposal:
Applying the selection guiding principles results in the following selection proposal.
Items:
Specifically, the items that were available by the deadline were checked to be not too diffuse or chaotic. Excess items in one format were re-formatted to be available in another format where too few items were available.
- ISM: (available by deadline)
- 5 Dolby items (which include two items from forge that were shared by Nokia during OMASA development)
- 4 FhG shared items
- 2 Ericsson shared items
- 1 Nokia shared item
- MASA: (available by deadline, generated after deadline)
- 11 ISM items have been converted to HOA2 and are used as input to the MASA analyzer:
- 3 Dolby items (which include two items from forge that were shared by Nokia during OMASA development)
- 4 FhG shared items
- 3 Ericsson shared items
- 1 Nokia shared item
- 1 HOA3 item has been converted to HOA2 and is used as input to the MASA analyzer:
- 1 Dolby item
- MC: (available by deadline)
- 5 Dolby items
- 3 Nokia items
- SBA: (available by deadline, partly generated after deadline)
- 5 Dolby items
- 2 Ericsson ISM4 converted to HOA3
- 1 FhG ISM4 converted to HOA3
Trajectories:
Specifically, the trajectories that were available by the deadline were checked to be in the required range. The real trajectories were selected to be perceptually in a similar range as the synthetic trajectories.
Head trajectory # | pre | post | Nullification |
1 | Static 20 | 0 | Yes |
2 | Static -20 | 0 | Yes |
3 | Sinusoidal (0.25 Hz) | 0 | Yes |
4 | Triangular (0.5 Hz) | 0 | Yes |
5 | Triangular (0.5 Hz) yaw + static 20 pitch | 0 | Yes |
6 | Triangular (0.5 Hz) yaw + static 20 roll | 0 | Yes |
7 | Triangular (~0.5 Hz) 1D 20 | -10 | No |
8 | Real #3 | 0 | Yes |
Suggested combination:
After verification, the combination of sound items and trajectories was implemented as part of the ISAR processing scripts.
- Conclusion
This report summarizes the actions taken on item and trajectory collection, selection and verification. Annex E shows the implementation of the combination of selected sound items and trajectories.
- Implementation of combination of sound items and trajectories
Note: the footnotes below explain the changes made as result of the verification of the combination of sound items and trajectories.
- ISM
File names mapping:
### Selected audio items mapped to a tag ####
audio files mapping =
[
('ism4_1yaw_20.wav', 'ism4_1'),
('ism4_2sin_t20.wav', 'ism4_2'),
('ism4_3tri_t20p20.wav', 'ism4_3'),
('ism4_4sin_t20.wav', 'ism4_4'),
('ism4_5tri_t20p20.wav', 'ism4_5'),
('Ericsson_Birthday.wav', 'ism4_6'),
('Ericsson_BouncingBall.wav', 'ism4_7'),
('Fhg_ISM4_01/Fhg_ISM4_01.wav', 'ism4_8'),
('Fhg_ISM4_02/Fhg_ISM4_02.wav', 'ism4_9'),
('Fhg_ISM4_03/Fhg_ISM4_03.wav', 'ism4_10'),
('Fhg_ISM4_04/Fhg_ISM4_04.wav', 'ism4_11'),
('Nokia_ArtoMusaMix6.wav', 'ism4_12'),
];
### Selected head trajectories mapped to a tag ####
head trajectory mapping = [
('pre_static_20_0_0.csv','post_static_0_0_0.csv', 'traj1'),
('pre_static_-20_0_0.csv','post_static_0_0_0.csv', 'traj2'),
('sin_traj20.csv','post_static_0_0_0.csv', 'traj3'),
('tri_traj20.csv','post_static_0_0_0.csv', 'traj4'),
('tri_traj20_p20.csv','post_static_0_0_0.csv', 'traj5'),
('tri_traj20_r20.csv','post_static_0_0_0.csv', 'traj6'),
('01_pre_1D_20triangular.csv','01_post_1D_20triangular.csv', 'traj7'),
('trajectory_real_fhg_3.csv','post_static_0_0_0.csv', 'traj8'),
];
### BS.1534 test items (audio tag + trajectory tag) ####
test items =
[
('ism4_1traj6'),
('ism4_2traj1'),
('ism4_3traj4'),
('ism4_4traj1'),
('ism4_5traj5'),
('ism4_6traj8'),
('ism4_7traj7'),
('ism4_8traj2'),
('ism4_9traj7'),
('ism4_10traj5'),
('ism4_11traj3'),
('ism4_12traj3')
];
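Each test item name above concatenates an audio tag and a trajectory tag. The following sketch shows one way such a name could be split and resolved against the two mappings; the function names and the tuple layout of the mappings are assumptions based on the listing, not the implementation in the processing scripts.

```python
def split_test_item(item):
    """Split e.g. 'ism4_10traj5' into ('ism4_10', 'traj5')."""
    audio_tag, sep, number = item.rpartition("traj")
    if not sep or not audio_tag or not number.isdigit():
        raise ValueError(f"not a valid test item name: {item!r}")
    return audio_tag, "traj" + number


def resolve_test_item(item, audio_map, trajectory_map):
    """Return (audio file, pre trajectory, post trajectory) for a test item.

    audio_map holds (file, tag) pairs and trajectory_map holds
    (pre file, post file, tag) triples, as in the listing above.
    """
    audio_tag, traj_tag = split_test_item(item)
    audio_file = next(f for f, tag in audio_map if tag == audio_tag)
    pre, post = next((p, q) for p, q, tag in trajectory_map if tag == traj_tag)
    return audio_file, pre, post
```

Splitting at the last occurrence of "traj" keeps audio tags that contain underscores or format suffixes, such as 'ism4_1_HOA2', intact.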
- MASA
File names mapping:
### MASA inputs are HOA2 files that are generated from ISM4 inputs ###
### Selected audio items mapped to a tag ####
audio files mapping =
[
('dlb_ism4_1_HOA2.wav', 'ism4_1_HOA2'),
('Ericsson_pie_HOA2.wav', 'ism4_2_HOA2'), ##origin ISM4
('HOA2S_FA1MB1_Dry_Render.wav', 'HOA3_3_HOA2'), ##origin HOA3 [2]
('dlb_ism4_4_HOA2.wav', 'ism4_4_HOA2'),
('dlb_ism4_5_HOA2.wav', 'ism4_5_HOA2'),
('Ericsson_Birthday_HOA2.wav', 'ism4_6_HOA2'),
('Ericsson_BouncingBall_HOA2.wav', 'ism4_7_HOA2'),
('Fhg_ISM4_01_HOA2.wav', 'ism4_8_HOA2'),
('Fhg_ISM4_02_HOA2.wav', 'ism4_9_HOA2'),
('Fhg_ISM4_03_HOA2.wav', 'ism4_10_HOA2'),
('Fhg_ISM4_04_HOA2.wav', 'ism4_11_HOA2'),
('Nokia_ArtoMusaMix6_HOA2.wav', 'ism4_12_HOA2'),
];
### Selected head trajectories mapped to a tag ####
head trajectory mapping = [
('pre_static_20_0_0.csv','post_static_0_0_0.csv', 'traj1'),
('pre_static_-20_0_0.csv','post_static_0_0_0.csv', 'traj2'),
('sin_traj20.csv','post_static_0_0_0.csv', 'traj3'),
('tri_traj20.csv','post_static_0_0_0.csv', 'traj4'),
('tri_traj20_p20.csv','post_static_0_0_0.csv', 'traj5'),
('tri_traj20_r20.csv','post_static_0_0_0.csv', 'traj6'),
('01_pre_1D_20triangular.csv','01_post_1D_20triangular.csv', 'traj7'),
('trajectory_real_fhg_3.csv','post_static_0_0_0.csv', 'traj8'),
];
### BS.1534 test items (audio tag + trajectory tag) ####
test items =
[
('ism4_1_HOA2traj6'),
('ism4_2_HOA2traj2'),
('HOA3_3_HOA2traj4'),
('ism4_4_HOA2traj1'),
('ism4_5_HOA2traj5'),
('ism4_6_HOA2traj8'),
('ism4_7_HOA2traj7'),
('ism4_8_HOA2traj2'),
('ism4_9_HOA2traj7'),
('ism4_10_HOA2traj5'),
('ism4_11_HOA2traj3'),
('ism4_12_HOA2traj3')
];
- MC
### Selected audio items mapped to a tag ####
audio files mapping =
[
('HOA3S_FA1MB1_Dry_Render_CICP19.wav', 'mc714_1'),
('Shattered_hpmd_clip1_1_CICP19.wav', 'mc714_2'),
('Shattered_hpmd_clip1_2_CICP19.wav', 'mc714_3'), ## see footnote [3]
('SayGoodbye_noTrims_1_CICP19.wav', 'mc714_4'),
('VeryOwnHeaven_noTrims_1_CICP19.wav', 'mc714_5'),
('Nokia_ArtoMix1_MC714.wav', 'mc714_6'),
('Nokia_ArtoMix2_MC714.wav', 'mc714_7'),
('Nokia_ArtoMix3_MC714.wav', 'mc714_8'),
];
### Selected head trajectories mapped to a tag ####
head trajectory mapping = [
('pre_static_20_0_0.csv','post_static_0_0_0.csv', 'traj1'),
('pre_static_-20_0_0.csv','post_static_0_0_0.csv', 'traj2'),
('sin_traj20.csv','post_static_0_0_0.csv', 'traj3'),
('tri_traj20.csv','post_static_0_0_0.csv', 'traj4'),
('tri_traj20_p20.csv','post_static_0_0_0.csv', 'traj5'),
('tri_traj20_r20.csv','post_static_0_0_0.csv', 'traj6'),
('01_pre_1D_20triangular.csv','01_post_1D_20triangular.csv', 'traj7'),
('trajectory_real_fhg_3.csv','post_static_0_0_0.csv', 'traj8'),
];
### BS.1534 test items (audio tag + trajectory tag) ####
test items =
[
('mc714_1traj1'),
('mc714_1traj2'),
('mc714_2traj3'),
('mc714_2traj4'),
('mc714_3traj1'),
('mc714_3traj5'),
('mc714_4traj6'),
('mc714_5traj5'),
('mc714_5traj6'),
('mc714_6traj7'),
('mc714_7traj7'),
('mc714_8traj8')
];
- SBA
File names mapping:
### Selected audio items mapped to a tag ####
audio files mapping =
[
('HOA3S_FA1MB1_Dry_Render.wav', 'sbahoa3_1'),
('Shattered_hpmd_clip1_1_HOA3S.wav', 'sbahoa3_2'),
('Shattered_hpmd_clip1_2_HOA3S.wav', 'sbahoa3_3'),
('SayGoodbye_noTrims_1_HOA3S.wav', 'sbahoa3_4'),
('VeryOwnHeaven_noTrims_1_HOA3S.wav', 'sbahoa3_5'),
('Ericsson_Birthday_HOA3.wav', 'sbahoa3_6'),
('Ericsson_BouncingBall_HOA3.wav', 'sbahoa3_7'),
('Fhg_ISM4_02_HOA3.wav', 'sbahoa3_8'),
];
### Selected head trajectories mapped to a tag ####
head trajectory mapping = [
('pre_static_20_0_0.csv','post_static_0_0_0.csv', 'traj1'),
('pre_static_-20_0_0.csv','post_static_0_0_0.csv', 'traj2'),
('sin_traj20.csv','post_static_0_0_0.csv', 'traj3'),
('tri_traj20.csv','post_static_0_0_0.csv', 'traj4'),
('tri_traj20_p20.csv','post_static_0_0_0.csv', 'traj5'),
('tri_traj20_r20.csv','post_static_0_0_0.csv', 'traj6'),
('01_pre_1D_20triangular.csv','01_post_1D_20triangular.csv', 'traj7'),
('trajectory_real_fhg_3.csv','post_static_0_0_0.csv', 'traj8'),
];
### BS.1534 test items (audio tag + trajectory tag) ####
test items =
[
('sbahoa3_1traj1'),
('sbahoa3_1traj2'),
('sbahoa3_2traj3'),
('sbahoa3_2traj4'),
('sbahoa3_3traj1'),
('sbahoa3_3traj5'),
('sbahoa3_4traj6'),
('sbahoa3_5traj5'),
('sbahoa3_5traj6'),
('sbahoa3_6traj7'),
('sbahoa3_7traj7'),
('sbahoa3_8traj8')
];
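A simple consistency check of such a test item list is to count the items and the use of each trajectory. The sketch below does this for the SBA list above; the variable and function names are hypothetical, and the check is an illustration rather than part of the processing scripts.

```python
from collections import Counter

# SBA test items as listed above (audio tag + trajectory tag).
SBA_TEST_ITEMS = [
    "sbahoa3_1traj1", "sbahoa3_1traj2", "sbahoa3_2traj3", "sbahoa3_2traj4",
    "sbahoa3_3traj1", "sbahoa3_3traj5", "sbahoa3_4traj6", "sbahoa3_5traj5",
    "sbahoa3_5traj6", "sbahoa3_6traj7", "sbahoa3_7traj7", "sbahoa3_8traj8",
]


def trajectory_usage(test_items):
    """Count how often each trajectory tag occurs in a list of test items."""
    return Counter("traj" + item.rpartition("traj")[2] for item in test_items)


def unused_trajectories(test_items, num_trajectories=8):
    """Return the trajectory tags traj1..trajN that no test item uses."""
    usage = trajectory_usage(test_items)
    return [f"traj{i}" for i in range(1, num_trajectories + 1) if usage[f"traj{i}"] == 0]
```

Applied to the SBA list, the check confirms that the experiment has 12 items and that every one of the eight selected trajectories is used at least once.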
Footnote [1]: Stefan Bruhn, Dolby Sweden AB; email: [email protected]
Footnote [2]: Verification listening revealed a quality issue of the IVAS codec for the originally proposed sound items, likely caused by a bug. Consequently, the original items 'dlb_ism4_2_HOA2.wav' and 'dlb_ism4_3_HOA2.wav' were found unusable and were replaced. The trajectory was changed from static 20 degrees to static -20 degrees for balance reasons.
Footnote [3]: The item 'Shattered_hpmd_clip1_1_CICP19.wav' was used twice in the first list due to a copy-paste mistake. It was replaced by the intended item 'Shattered_hpmd_clip1_2_CICP19.wav'.