VFL

Vertical Federated Learning

Other
Introduced in Rel-19
A privacy-preserving distributed machine learning framework in which multiple parties collaboratively train a model using different feature sets for the same set of users. It enables data collaboration across organizational silos without exposing raw data, which is important for network optimization and AI services.

Description

Vertical Federated Learning (VFL) is a specialized distributed machine learning paradigm standardized by 3GPP to enable collaborative AI model training across different organizations or network domains without centralizing raw, sensitive data. In contrast to horizontal federated learning, where participants share the same feature space but hold different user samples, VFL is characterized by participants holding different features or attributes for the same set of overlapping user IDs. A typical scenario involves a mobile network operator holding radio access network (RAN) measurement data and an Over-The-Top (OTT) service provider holding application-layer quality data for the same subscribers. VFL allows these parties to jointly train a more comprehensive and accurate model (for instance, one that predicts user experience) while keeping their respective datasets private and on-premises.
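
To make the vertical partitioning concrete, the following minimal sketch shows two parties holding disjoint feature columns for partly overlapping user IDs. All identifiers, feature names, and values are hypothetical and chosen only for illustration. The plain set intersection is used here purely to show the data layout; it is not how a privacy-preserving deployment would find the shared users.

```python
# Hypothetical toy records: user IDs, feature names, and values are illustrative only.
operator_data = {  # party A: RAN measurements keyed by user ID
    "u1": {"rsrp_dbm": -95, "sinr_db": 12},
    "u2": {"rsrp_dbm": -110, "sinr_db": 3},
    "u3": {"rsrp_dbm": -88, "sinr_db": 20},
}
ott_data = {  # party B: application-layer quality keyed by user ID
    "u2": {"stall_ratio": 0.08},
    "u3": {"stall_ratio": 0.01},
    "u4": {"stall_ratio": 0.03},
}


def vertical_view(a, b):
    """Return the overlapping user IDs and the feature names each party holds."""
    shared_users = sorted(a.keys() & b.keys())
    features_a = sorted({name for row in a.values() for name in row})
    features_b = sorted({name for row in b.values() for name in row})
    return shared_users, features_a, features_b


shared_users, features_a, features_b = vertical_view(operator_data, ott_data)
```

In a horizontal setting the two parties would instead share the same feature names and hold mostly different user IDs.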

The technical operation of VFL involves a structured protocol with roles such as a guest party, one or more host parties, and potentially a coordinator. The process begins with privacy-preserving entity alignment, in which the participating parties use cryptographic techniques such as Private Set Intersection (PSI) to securely identify their common users without revealing non-overlapping IDs. Once the aligned user set is established, collaborative training begins. A common architecture splits the model into a bottom model and a top model. Each party trains its own bottom model on its local feature set. The outputs (embeddings or intermediate results) of these bottom models are then securely aggregated, often via homomorphic encryption or secure multi-party computation (MPC), to compute the loss and the gradients for the top model. The gradients are then distributed back to each party to update its bottom model, all without any party seeing the raw features or labels of another.
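
The bottom/top split described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the 3GPP-specified procedure: each bottom model is a single linear layer, the top model is a logistic unit held by the label owner, and the intermediate values are exchanged in the clear, whereas a real deployment would protect them with homomorphic encryption or MPC. The class names, toy data, and hyperparameters are all hypothetical.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


class BottomModel:
    """Local linear model; only its scalar output leaves the party."""

    def __init__(self, n_features, rng):
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]

    def forward(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def backward(self, x, grad_out, lr):
        # Chain rule: dL/dw_i = dL/dz * x_i, using the gradient sent by the top model.
        self.w = [wi - lr * grad_out * xi for wi, xi in zip(self.w, x)]


class TopModel:
    """Held by the label owner; combines the parties' embeddings."""

    def __init__(self, n_parties, rng):
        self.v = [rng.uniform(0.5, 1.0) for _ in range(n_parties)]
        self.c = 0.0

    def step(self, z, y, lr):
        p = sigmoid(sum(vi * zi for vi, zi in zip(self.v, z)) + self.c)
        loss = -(y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12))
        err = p - y  # derivative of the cross-entropy loss w.r.t. the logit
        grads_z = [err * vi for vi in self.v]  # computed before v is updated
        self.v = [vi - lr * err * zi for vi, zi in zip(self.v, z)]
        self.c -= lr * err
        return loss, grads_z


def train(xa, xb, labels, epochs=200, lr=0.1, seed=0):
    """Run per-sample training over aligned rows; return the mean loss per epoch."""
    rng = random.Random(seed)
    bottom_a = BottomModel(len(xa[0]), rng)
    bottom_b = BottomModel(len(xb[0]), rng)
    top = TopModel(2, rng)
    history = []
    for _ in range(epochs):
        total = 0.0
        for a, b, y in zip(xa, xb, labels):
            z = [bottom_a.forward(a), bottom_b.forward(b)]  # embeddings sent upward
            loss, grads_z = top.step(z, y, lr)              # loss computed with labels
            bottom_a.backward(a, grads_z[0], lr)            # gradients sent back down
            bottom_b.backward(b, grads_z[1], lr)
            total += loss
        history.append(total / len(labels))
    return history


# Toy aligned data: row i of xa and xb refers to the same (hypothetical) user.
xa = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.2, 0.9]]
xb = [[0.9], [0.8], [0.2], [0.1]]
labels = [1, 1, 0, 0]
history = train(xa, xb, labels)
```

Note that neither bottom model ever reads the other party's feature rows or the labels; each one sees only its own inputs and the gradient returned for its own output.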

Key components in the 3GPP VFL framework include the Network Data Analytics Function (NWDAF), which can act as a participant or coordinator; standardized interfaces for federated learning orchestration (e.g., Naf_FederatedLearning); and security protocols for secure aggregation and model exchange. The architecture is designed to integrate with the 5G Service-Based Architecture (SBA), allowing network functions such as the AMF, SMF, and PCF to contribute data to federated learning processes. VFL's role is to unlock the value of partitioned data silos within the telecom ecosystem, enabling advanced AI/ML use cases such as joint network-service optimization, churn prediction, and personalized QoS management, while adhering to data privacy regulations such as GDPR.

Purpose & Motivation

VFL was introduced to address the critical challenge of data silos and privacy constraints that hinder the development of advanced AI-driven network and service management. In the telecom industry, valuable data is fragmented across operators, vendors, and service providers. For example, an operator has detailed network performance data, while a content provider has rich application behavior data. Individually, these datasets provide a limited view; combined, they could power highly accurate predictive models. However, legal, regulatory, and competitive barriers prevent the sharing or centralization of this raw data. Traditional methods of data pooling or model training on centralized datasets are thus infeasible, limiting the potential of AI in 5G and beyond.

The standardization of VFL in 3GPP Release 19 was motivated by the need to foster a trusted data collaboration ecosystem for 5G-Advanced networks and 6G preparation. It addresses the data-silo problem by providing a standardized, secure framework for collaborative learning that preserves data sovereignty. Participants benefit from the combined predictive power of distributed feature sets, while technical and procedural safeguards keep raw data under its owner's control. VFL opens up new business models and operational efficiencies, such as co-developing churn prediction models with banking partners or optimizing video streaming jointly with content delivery networks, all within a privacy-by-design framework that builds trust among stakeholders.

Key Features

  • Privacy-preserving entity alignment using techniques like Private Set Intersection (PSI)
  • Split-model architecture with local bottom models and a collaboratively trained top model
  • Secure gradient and loss computation using homomorphic encryption or secure multi-party computation
  • Integration with 5G Service-Based Architecture and the NWDAF for network analytics use cases
  • Standardized interfaces and procedures for federated learning orchestration and lifecycle management
  • Support for heterogeneous data owners with vertically partitioned feature spaces
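
As an illustration of the entity alignment feature listed above, the following sketch implements a toy Diffie-Hellman-style PSI in which each party blinds hashed IDs with a secret exponent. It is an assumption-laden teaching example, not the mechanism specified by 3GPP: both parties are simulated in one process, the modulus is a toy choice that is not suitable for real use, and the function names are hypothetical.

```python
import hashlib
import math
import secrets

# Toy modulus (assumption): a Mersenne prime picked for readability, not for security.
P = 2**127 - 1


def hash_to_group(user_id):
    """Map an ID to a group element; squaring lands in the quadratic residues."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return pow(int.from_bytes(digest, "big") % P, 2, P)


def random_exponent():
    """Pick a secret exponent coprime to P - 1 so that blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def dh_psi(ids_a, ids_b):
    """Return the IDs of party A that party B also holds."""
    a, b = random_exponent(), random_exponent()  # each party's own secret
    # Step 1 (A -> B): A sends its hashed IDs blinded with a.
    blinded_a = [pow(hash_to_group(x), a, P) for x in ids_a]
    # Step 2 (B -> A): B re-blinds A's values with b, keeping the order,
    # and sends its own hashed IDs blinded with b.
    double_a = [pow(v, b, P) for v in blinded_a]
    blinded_b = [pow(hash_to_group(y), b, P) for y in ids_b]
    # Step 3 (A): A applies a to B's values; equal IDs now give equal values
    # because (h^a)^b == (h^b)^a mod P.
    double_b = {pow(v, a, P) for v in blinded_b}
    return sorted(x for x, v in zip(ids_a, double_a) if v in double_b)


common = dh_psi(["u1", "u2", "u3"], ["u2", "u3", "u4"])
```

Only blinded values cross the party boundary in this sketch, so IDs outside the intersection are not exchanged in the clear. In this one-directional variant only party A learns the result and would need to share it with party B.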

Evolution Across Releases

Rel-19 Initial

Initial standardization of Vertical Federated Learning architecture, procedures, and security requirements. Specifications defined the functional roles, reference architecture, and interfaces for VFL within the 5G system, enabling privacy-preserving collaborative AI model training across network operators and third parties.

Defining Specifications

Specification    Title
TS 21.905        3GPP TS 21.905
TS 23.288        3GPP TS 23.288
TS 23.482        3GPP TS 23.482
TS 23.700        3GPP TS 23.700
TS 24.560        3GPP TS 24.560
TS 28.105        3GPP TS 28.105
TS 28.858        3GPP TS 28.858
TS 29.510        3GPP TS 29.510
TS 29.520        3GPP TS 29.520
TS 29.530        3GPP TS 29.530
TS 29.552        3GPP TS 29.552
TS 29.591        3GPP TS 29.591
TS 33.501        3GPP TS 33.501
TR 33.784        3GPP TR 33.784