Description
Vertical Federated Learning (VFL) is a distributed machine learning paradigm standardized by 3GPP that enables collaborative AI model training across organizations or network domains without centralizing raw, sensitive data. In horizontal federated learning, participants share the same feature space but hold different user samples. In VFL, participants hold different features for an overlapping set of user IDs. A typical scenario involves a mobile network operator holding radio access network (RAN) measurement data and an Over-The-Top (OTT) service provider holding application-layer quality data for the same subscribers. VFL allows these parties to jointly train a more comprehensive and accurate model, for instance one that predicts user experience, while keeping their respective datasets private and on-premises.
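The difference between the two partitioning styles can be made concrete with a small sketch. The subscriber IDs, field names, and the `describe_partition` helper below are hypothetical and exist only for illustration. The overlap is also computed in the clear here, which a real VFL deployment would avoid by using privacy-preserving alignment.

```python
# Toy records keyed by subscriber ID. All IDs and field names are made up.
operator_ran = {
    "sub-001": {"rsrp_dbm": -95, "sinr_db": 12},
    "sub-002": {"rsrp_dbm": -110, "sinr_db": 3},
    "sub-003": {"rsrp_dbm": -88, "sinr_db": 18},
}
ott_app = {
    "sub-002": {"stall_ratio": 0.12, "bitrate_kbps": 1800},
    "sub-003": {"stall_ratio": 0.01, "bitrate_kbps": 6500},
    "sub-004": {"stall_ratio": 0.05, "bitrate_kbps": 4000},
}


def describe_partition(a, b):
    """Summarise how two keyed datasets overlap in users and in features."""
    shared_ids = sorted(a.keys() & b.keys())
    features_a = {name for row in a.values() for name in row}
    features_b = {name for row in b.values() for name in row}
    shared_features = sorted(features_a & features_b)
    return {
        "shared_ids": shared_ids,
        "shared_features": shared_features,
        # Vertical partitioning: common users, disjoint feature sets.
        "vertical": bool(shared_ids) and not shared_features,
    }


summary = describe_partition(operator_ran, ott_app)
print(summary)
```

In this toy data the two parties share two subscribers and no feature names, which is the vertical case. A horizontal setup would show the opposite pattern: shared feature names and few or no shared IDs.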
VFL follows a structured protocol with defined roles: a guest party, one or more host parties, and optionally a coordinator. The process begins with privacy-preserving entity alignment, in which the participating parties use cryptographic techniques such as Private Set Intersection (PSI) to identify their common users without revealing non-overlapping IDs. Once the aligned user set is established, collaborative training begins. A common architecture splits the model into bottom models and a top model. Each party trains its own bottom model on its local feature set. The outputs of these bottom models (embeddings or other intermediate results) are then securely aggregated, often via homomorphic encryption or secure multi-party computation (MPC), to compute the loss and the gradients for the top model. The gradients are distributed back to each party to update its bottom model, and no party sees the raw features or labels of another.
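The split-model training loop can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the procedure defined by any 3GPP specification: the class names `BottomModel` and `TopModel`, the linear bottom models, the logistic top model, and the synthetic data are all hypothetical. Embeddings and gradients are exchanged in plaintext here, whereas a real deployment would protect that exchange with homomorphic encryption or MPC.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


class BottomModel:
    """Linear bottom model kept by one party next to its private features."""

    def __init__(self, features):
        self.features = features            # raw features stay inside the party
        self.w = [0.0] * len(features[0])

    def forward(self):
        # Only one scalar embedding per aligned user leaves the party.
        return [sum(wi * xi for wi, xi in zip(self.w, x)) for x in self.features]

    def backward(self, grad_z, lr):
        # Chain rule: dLoss/dw_j = mean(dLoss/dz * x_j).
        n = len(self.features)
        for j in range(len(self.w)):
            self.w[j] -= lr * sum(g * x[j] for g, x in zip(grad_z, self.features)) / n


class TopModel:
    """Top model at the label-holding party: logit = v[0]*z_a + v[1]*z_b + c."""

    def __init__(self, labels):
        self.labels = labels
        self.v = [1.0, 1.0]
        self.c = 0.0

    def step(self, z_a, z_b, lr):
        n = len(self.labels)
        probs = [sigmoid(self.v[0] * a + self.v[1] * b + self.c)
                 for a, b in zip(z_a, z_b)]
        loss = -sum(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
                    for p, y in zip(probs, self.labels)) / n
        err = [p - y for p, y in zip(probs, self.labels)]   # dLoss/dlogit per user
        # Gradients w.r.t. each party's embeddings, sent back to that party.
        grad_za = [e * self.v[0] for e in err]
        grad_zb = [e * self.v[1] for e in err]
        # Update the top model's own parameters.
        self.v[0] -= lr * sum(e * a for e, a in zip(err, z_a)) / n
        self.v[1] -= lr * sum(e * b for e, b in zip(err, z_b)) / n
        self.c -= lr * sum(err) / n
        return loss, grad_za, grad_zb


# Synthetic aligned data: party A has two features, party B has one.
random.seed(0)
n_users = 200
x_a = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(n_users)]
x_b = [[random.uniform(-1, 1)] for _ in range(n_users)]
labels = [1 if a[0] + b[0] > 0 else 0 for a, b in zip(x_a, x_b)]

party_a, party_b, top = BottomModel(x_a), BottomModel(x_b), TopModel(labels)
losses = []
for _ in range(200):
    z_a, z_b = party_a.forward(), party_b.forward()
    loss, g_a, g_b = top.step(z_a, z_b, lr=0.5)
    party_a.backward(g_a, lr=0.5)
    party_b.backward(g_b, lr=0.5)
    losses.append(loss)

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The label depends on one feature from each party, so neither party could fit it well alone. The only values that cross a party boundary are the per-user embeddings and the per-user gradients with respect to them.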
Key components in the 3GPP VFL framework include the Network Data Analytics Function (NWDAF), which can act as a participant or coordinator; standardized interfaces for federated learning orchestration (e.g., Naf_FederatedLearning); and security protocols for secure aggregation and model exchange. The architecture is designed to integrate with the 5G Service-Based Architecture (SBA), allowing network functions such as the AMF, SMF, and PCF to contribute data to federated learning processes. VFL's role is to unlock the value of partitioned data silos within the telecom ecosystem. It enables AI/ML use cases such as joint network-service optimization, churn prediction, and personalized QoS management while complying with data privacy regulations such as GDPR.
Purpose & Motivation
VFL was introduced to address the critical challenge of data silos and privacy constraints that hinder the development of advanced AI-driven network and service management. In the telecom industry, valuable data is fragmented across operators, vendors, and service providers. For example, an operator has detailed network performance data, while a content provider has rich application behavior data. Individually, these datasets provide a limited view; combined, they could power highly accurate predictive models. However, legal, regulatory, and competitive barriers prevent the sharing or centralization of this raw data. Traditional methods of data pooling or model training on centralized datasets are thus infeasible, limiting the potential of AI in 5G and beyond.
The standardization of VFL in 3GPP Release 19 was motivated by the need to foster a trusted data collaboration ecosystem for 5G-Advanced networks and in preparation for 6G. It addresses the problem by providing a standardized, secure framework for collaborative learning that preserves data sovereignty. Participants benefit from the combined predictive power of distributed feature sets, with technical and procedural guarantees that raw data never leaves its owner's control. VFL enables new business models and operational efficiencies, such as co-developing churn prediction models with banking partners or optimizing video streaming jointly with content delivery networks, all within a privacy-by-design framework that builds trust among stakeholders.
Key Features
- Privacy-preserving entity alignment using techniques like Private Set Intersection (PSI)
- Split-model architecture with local bottom models and a collaboratively trained top model
- Secure gradient and loss computation using homomorphic encryption or secure multi-party computation
- Integration with 5G Service-Based Architecture and the NWDAF for network analytics use cases
- Standardized interfaces and procedures for federated learning orchestration and lifecycle management
- Support for heterogeneous data owners with vertically partitioned feature spaces
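The entity alignment feature listed above can be illustrated with a toy Private Set Intersection based on commutative exponentiation (a Diffie-Hellman-style construction). This is one well-known PSI technique, shown here as an assumption; the text does not state which PSI protocol 3GPP mandates. The modulus, the helper names (`hash_to_group`, `new_secret`, `blind`, `psi`), and the sample IDs are hypothetical. Both parties are simulated in one process, and the parameters are not secure for real use.

```python
import hashlib
import math
import secrets

# Toy modulus: the Mersenne prime 2**127 - 1. Chosen only so the sketch runs
# with the standard library; it is NOT a secure parameter choice.
P = 2**127 - 1


def hash_to_group(user_id):
    """Map an ID string to an integer in [1, P-1]."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret():
    """Pick an exponent coprime to P-1 so x -> x**k mod P is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values, key):
    """Raise every value to a party's secret exponent."""
    return [pow(v, key, P) for v in values]


def psi(ids_a, ids_b):
    """Return the IDs of party A that party B also holds, as A would learn them."""
    key_a, key_b = new_secret(), new_secret()   # each key stays with its owner
    # Round 1: each party blinds its own hashed IDs and sends them across.
    a_once = blind([hash_to_group(i) for i in ids_a], key_a)   # A -> B
    b_once = blind([hash_to_group(i) for i in ids_b], key_b)   # B -> A
    # Round 2: each party blinds the other side's values with its own key.
    a_twice = blind(a_once, key_b)          # B -> A, same order as ids_a
    b_twice = set(blind(b_once, key_a))     # computed locally by A
    # (h^a)^b == (h^b)^a, so doubly blinded values match exactly for shared IDs.
    return [i for i, t in zip(ids_a, a_twice) if t in b_twice]


operator_ids = ["sub-001", "sub-002", "sub-003"]
ott_ids = ["sub-002", "sub-003", "sub-004"]
common = psi(operator_ids, ott_ids)
print(common)
```

Each side only ever sees the other's IDs after they have been blinded with a key it does not know, so non-overlapping IDs are not revealed in the clear. Set sizes still leak in this sketch.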
Evolution Across Releases
Release 19: Initial standardization of the Vertical Federated Learning architecture, procedures, and security requirements. The specifications define the functional roles, reference architecture, and interfaces for VFL within the 5G system, enabling privacy-preserving collaborative AI model training across network operators and third parties.
Defining Specifications
| Specification | Title |
|---|---|
| TS 21.905 | 3GPP TS 21.905 |
| TS 23.288 | 3GPP TS 23.288 |
| TS 23.482 | 3GPP TS 23.482 |
| TS 23.700 | 3GPP TS 23.700 |
| TS 24.560 | 3GPP TS 24.560 |
| TS 28.105 | 3GPP TS 28.105 |
| TS 28.858 | 3GPP TS 28.858 |
| TS 29.510 | 3GPP TS 29.510 |
| TS 29.520 | 3GPP TS 29.520 |
| TS 29.530 | 3GPP TS 29.530 |
| TS 29.552 | 3GPP TS 29.552 |
| TS 29.591 | 3GPP TS 29.591 |
| TS 33.501 | 3GPP TS 33.501 |
| TR 33.784 | 3GPP TR 33.784 |