HI-ID (Huffman Initialization ID) — 3GPP Glossary

HI-ID is an identifier used in the 3GPP Short Message Service (SMS) protocol to indicate the Huffman table used for character encoding. It ensures the correct decoding of compressed text in SMS messages, particularly for languages with large character sets.

Description

The Huffman Initialization ID (HI-ID) is a parameter defined within the 3GPP SMS specifications (TS 23.042) for the SMS Compression Protocol. SMS compression is employed to reduce the number of bits needed to represent text, allowing more characters to be sent within a single SMS message segment, especially for languages with large alphabets like Chinese, Japanese, or Korean. The HI-ID is a key component that tells the receiving device which specific Huffman coding table was used by the sender to compress the message.

Huffman coding is a lossless data compression algorithm that uses variable-length codes to represent characters. The efficiency of the compression depends on the frequency distribution of characters in the language. Therefore, different predefined Huffman tables are optimized for different languages or character sets (e.g., a table for Basic Latin, another for Japanese Kanji). During SMS message assembly, the sending entity selects the appropriate table based on the message's language, compresses the text, and includes the HI-ID in the SMS User Data Header to identify the table used.
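The table-selection step described above can be illustrated with a minimal Python sketch. It is not the TS 23.042 algorithm: the HI-ID values, the frequency tables in `TABLES`, and the helper names `build_codes` and `compress` are all hypothetical, chosen only to show how a frequency table determines the variable-length codes and how the identifier travels with the compressed bits.

```python
import heapq
from itertools import count

# Hypothetical frequency tables keyed by made-up HI-ID values.
# The real identifiers and table contents are defined by TS 23.042, not here.
TABLES = {
    0: {" ": 15, "e": 12, "t": 9, "a": 8, "o": 7, "h": 5},
    1: {" ": 12, "a": 10, "n": 9, "i": 8, "o": 7, "k": 6},
}

def build_codes(freqs):
    """Return {symbol: bitstring} for a Huffman code built from freqs."""
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(w, next(tie), {s: ""}) for s, w in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in left.items()}
        merged.update({s: "1" + b for s, b in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

def compress(text, hi_id):
    """Encode text with the table named by hi_id; return (hi_id, bits)."""
    codes = build_codes(TABLES[hi_id])
    return hi_id, "".join(codes[ch] for ch in text)

hi_id, bits = compress("the hat", 0)
print(hi_id, bits, len(bits), "bits")
```

Frequent symbols such as the space receive shorter codes than rare ones such as "h", which is where the saving comes from. The returned pair stands in for the identifier and payload that the sender places in the message.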

Upon receipt, the mobile device reads the HI-ID from the message header and uses it to select the same Huffman decoding table from its local memory. Using the correct table is essential: a Huffman bit stream carries no symbol boundaries of its own, so decoding it with a different table typically yields garbled text. Because the HI-ID refers to a common set of predefined tables, handsets and network elements from different vendors can interoperate. The field itself is small, but the whole message becomes unreadable if it is misinterpreted.
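The effect of choosing the wrong table can be seen in a small Python sketch. The two prefix-code tables below are hand-written and hypothetical (they have the shape a Huffman construction would produce, but are not taken from TS 23.042), and the integer keys stand in for HI-ID values.

```python
# Hypothetical prefix-code tables keyed by made-up HI-ID values.
CODE_TABLES = {
    0: {"a": "0", "b": "10", "c": "110", "d": "111"},
    1: {"d": "0", "c": "10", "b": "110", "a": "111"},
}

def encode(text, hi_id):
    """Concatenate the code of each character using the selected table."""
    table = CODE_TABLES[hi_id]
    return "".join(table[ch] for ch in text)

def decode(bits, hi_id):
    """Decode a bit string using the table selected by hi_id."""
    inverse = {code: sym for sym, code in CODE_TABLES[hi_id].items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:       # a complete code word has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)

bits = encode("abad", 0)   # sender compresses with table 0
print(decode(bits, 0))     # receiver follows the identifier: "abad"
print(decode(bits, 1))     # receiver uses the wrong table: "dcda"
```

Both decodes succeed without raising an error, which is why the receiver cannot detect a wrong table from the bits alone and has to rely on the identifier.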

Purpose & Motivation

The HI-ID was introduced to solve the problem of efficiently transmitting text messages in languages that need many distinct characters. Early SMS was designed primarily for Latin alphabets: a 140-octet segment holds 160 characters in the GSM 7-bit default alphabet. Text outside that alphabet, such as Chinese or Japanese, must be sent as UCS2 at 16 bits per character, which limits a segment to 70 characters. As SMS usage expanded globally, such messages often needed multiple segments, increasing cost and user inconvenience.

The SMS Compression Protocol, including the HI-ID, was developed to address this. With Huffman tables optimized for specific languages, a message can carry more characters per segment. The HI-ID identifies which table was used; without it, the receiver could not decompress the data and the compressed message would be useless. The identifier was designed as a lightweight, standardized signaling mechanism within the SMS protocol, enabling reliable cross-vendor and cross-network interoperability for compressed SMS and supporting the use of SMS beyond simple Latin text.

Key Features

Identifier field within the SMS User Data Header for compression protocol
Signals which predefined Huffman coding table was used for compression
Enables correct decompression of the SMS text by the receiving device
Supports interoperability between different handsets and networks
Essential for efficient SMS transmission in languages with large character sets
Defined as part of the 3GPP SMS Compression Protocol (TS 23.042)

Evolution Across Releases

R99 Initial

The Huffman Initialization ID (HI-ID) was initially defined as part of the SMS Compression Protocol introduced in 3GPP Release 99 (TS 23.042). It established the mechanism for identifying the Huffman table used to compress the SMS user data, enabling support for compressed messages in various languages.

Defining Specifications

Specification	Title
TS 23.042	Compression algorithm for text messaging services