Description
The Visual Positioning System (VPS) is a service capability within 3GPP networks designed to provide highly accurate positioning by leveraging visual information from user equipment (UE). Unlike traditional methods relying solely on radio signals (e.g., GNSS, OTDOA), VPS analyzes visual features from the device's camera feed. The core architecture involves the UE capturing images or video, extracting visual features (like keypoints and descriptors), and sending this data to a network-based positioning server, often via a Location Management Function (LMF) or a dedicated VPS server. The server compares these features against a pre-built visual map or database of geo-referenced visual landmarks. By matching the observed features to known locations in the database, the server calculates the UE's precise 3D position (latitude, longitude, altitude) and orientation (yaw, pitch, roll), then returns this estimate to the UE or a requesting application.
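The server-side matching step described above can be sketched as a nearest-neighbour search with a ratio test. This is a minimal illustration only: the `Landmark` record, the three-element descriptors, the sample map, and the `ratio` threshold are all hypothetical, and the source does not say which matching algorithm a VPS server uses. A real deployment would pass the resulting 2D-to-3D correspondences to a pose solver, which is not shown here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Landmark:
    """One entry in a hypothetical geo-referenced visual map."""
    landmark_id: str
    descriptor: tuple  # feature descriptor vector (toy size here)
    lat: float
    lon: float
    alt: float


def match_features(observed, landmarks, ratio=0.75):
    """Match each observed descriptor to its nearest landmark.

    A match is kept only when the nearest landmark is clearly closer than
    the second nearest (ratio test), which discards ambiguous matches.
    Returns a list of (observed_index, Landmark) pairs.
    """
    matches = []
    for index, descriptor in enumerate(observed):
        ranked = sorted(
            landmarks, key=lambda lm: math.dist(descriptor, lm.descriptor)
        )
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        best = math.dist(descriptor, ranked[0].descriptor)
        second = math.dist(descriptor, ranked[1].descriptor)
        if best < ratio * second:
            matches.append((index, ranked[0]))
    return matches


# Toy map and query; all values are made up for illustration.
VISUAL_MAP = [
    Landmark("door", (0.0, 0.0, 1.0), 48.8584, 2.2945, 35.0),
    Landmark("sign", (1.0, 0.0, 0.0), 48.8585, 2.2946, 36.0),
    Landmark("pillar", (0.0, 1.0, 0.0), 48.8586, 2.2947, 35.5),
]
observed = [(0.9, 0.1, 0.0), (0.5, 0.5, 0.0)]

for index, landmark in match_features(observed, VISUAL_MAP):
    print(index, landmark.landmark_id, landmark.lat, landmark.lon, landmark.alt)
```

The second observed descriptor is equally distant from two landmarks, so the ratio test rejects it rather than guessing.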
Key components include the UE with camera and processing capabilities, the visual map database (which can be cloud-based or distributed), and the positioning server that executes the matching algorithms. The VPS server may also fuse the visual positioning result with other sensor inputs from the UE, such as inertial measurement unit (IMU) data, Wi-Fi fingerprints, or cellular measurements, to improve accuracy, robustness, and continuity, especially during periods of poor visual feature tracking. The system typically operates in conjunction with the LTE Positioning Protocol (LPP) for the signaling exchange between the UE and the LMF, and with NR Positioning Protocol A (NRPPa) for the exchange between the LMF and NG-RAN nodes.
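One simple way to picture the fusion step is inverse-variance weighting of independent position estimates, sketched below. This is an assumption made for illustration: the source does not specify a fusion method, and practical systems often use a Kalman or particle filter instead. The function name, the local-metre coordinates, and the variance figures are hypothetical.

```python
def fuse_estimates(estimates):
    """Fuse independent position estimates by inverse-variance weighting.

    `estimates` is a list of (position, variance) pairs, where position is
    a tuple of coordinates in a shared local frame (metres) and variance is
    a single scalar in square metres applied to every axis.
    Returns (fused_position, fused_variance).
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    dims = len(estimates[0][0])
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    fused = tuple(
        sum(w * position[d] for w, (position, _) in zip(weights, estimates)) / total
        for d in range(dims)
    )
    return fused, 1.0 / total


# Hypothetical inputs: a confident visual fix and a drifting IMU estimate.
visual_fix = ((10.0, 20.0, 1.5), 0.04)  # variance 0.04 m^2
imu_estimate = ((10.4, 20.4, 1.5), 0.36)  # variance 0.36 m^2

position, variance = fuse_estimates([visual_fix, imu_estimate])
print(position, variance)
```

The fused position stays close to the lower-variance visual fix, and the fused variance is smaller than either input, which is the sense in which fusion improves robustness when one source degrades.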
VPS complements and enhances the overall 3GPP positioning framework defined in specifications such as TS 23.273. It addresses scenarios where radio-based positioning is insufficient, such as dense urban canyons or indoor environments where satellite signals are weak or blocked. The service enables a class of applications that require centimeter-to-meter-level accuracy and six degrees of freedom (6DoF) pose estimation, that is, 3D position plus 3D orientation, both of which are critical for immersive augmented reality (AR), precise indoor navigation, and context-aware services. Integration into 3GPP standards allows VPS to be deployed as a managed, scalable service with defined quality of service, security, and privacy controls.
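A 6DoF pose can be represented as three position values plus three orientation angles. The sketch below is illustrative only: the `Pose6DoF` type, its field names, and the Z-Y-X (yaw, then pitch, then roll) rotation convention are assumptions, since the text does not define a pose format or an angle convention.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose6DoF:
    """Hypothetical 6DoF pose: 3D position plus 3D orientation."""
    lat_deg: float
    lon_deg: float
    alt_m: float
    yaw_deg: float
    pitch_deg: float
    roll_deg: float

    def rotation_matrix(self):
        """Return R = Rz(yaw) * Ry(pitch) * Rx(roll) as a 3x3 nested tuple.

        The Z-Y-X order is an assumed convention, not taken from a spec.
        """
        y, p, r = (math.radians(a) for a in
                   (self.yaw_deg, self.pitch_deg, self.roll_deg))
        cy, sy = math.cos(y), math.sin(y)
        cp, sp = math.cos(p), math.sin(p)
        cr, sr = math.cos(r), math.sin(r)
        return (
            (cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr),
            (sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr),
            (-sp, cp * sr, cp * cr),
        )


def rotate(matrix, vector):
    """Apply a 3x3 rotation matrix to a 3-element vector."""
    return tuple(sum(row[k] * vector[k] for k in range(3)) for row in matrix)


# A device yawed by 90 degrees: its local x axis maps onto the frame's y axis.
pose = Pose6DoF(48.8584, 2.2945, 35.0, yaw_deg=90.0, pitch_deg=0.0, roll_deg=0.0)
print(rotate(pose.rotation_matrix(), (1.0, 0.0, 0.0)))
```

Keeping position and orientation in one record reflects why 6DoF matters for AR: rendering content correctly needs to know both where the camera is and which way it is facing.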
Purpose & Motivation
VPS was created to solve the fundamental limitation of existing radio-frequency-based positioning technologies in environments with poor satellite or cellular signal coverage, particularly indoors and in dense urban areas. Traditional methods like GPS, A-GNSS, and OTDOA often fail or provide inadequate accuracy (tens of meters) in such scenarios, hindering the development of precise location-based services. The proliferation of smartphones with high-quality cameras and advanced processing power presented an opportunity to use visual data as a rich source of positioning information, motivating its standardization to ensure interoperability and wide-scale deployment.
The historical context includes growing demand for augmented reality applications, accurate indoor navigation (e.g., in airports, malls, museums), and industrial automation, all of which require precise pose estimation. Before VPS, solutions relied on proprietary technologies, beacons, or fingerprinting, which lacked standardization and scalability. Integrating VPS into the 3GPP ecosystem, starting from early location-services studies in Release 99 (R99) and continuing with more concrete work in later releases, was intended to provide a unified, network-assisted positioning service that leverages the ubiquity of mobile cameras. Compared with those earlier approaches, it works where RF signals are weak, provides orientation data as well as position, and can be integrated with other cellular services for a consistent user experience.
Key Features
- Uses visual feature matching against a geo-referenced database for positioning
- Provides six degrees of freedom (6DoF) pose estimation (position and orientation)
- Operates in GPS-denied environments like indoors and urban canyons
- Network-assisted architecture with positioning server and visual map data
- Fusion with other sensor data (IMU, Wi-Fi, cellular) for improved robustness
- Standardized signaling via LPP/NRPPa for interoperability
Evolution Across Releases
Initial concept and study phase for location services. Visual positioning was considered as a potential future enhancement, with foundational work on network-assisted positioning architectures laid out, though VPS-specific protocols were not yet defined.
Enhanced location service capabilities and architectures. Continued studies on hybrid positioning methods, setting the stage for incorporating non-RF data sources like visual information into the positioning framework.
Increased focus on indoor positioning and new use cases. Specific study items and work items began to formally evaluate visual and sensor-based positioning techniques as part of the broader LTE positioning enhancements.
Standardization of VPS began in earnest within the context of enhanced LTE positioning. Initial specifications defined requirements, architecture, and procedures for network-assisted visual positioning, including data formats for visual feature exchange.
Further enhancements to VPS procedures and performance requirements. Integration with other technologies like pedestrian dead reckoning (PDR) and improved support for commercial use cases, particularly for augmented reality applications.
Significant enhancements for industrial IoT and ultra-reliable low-latency communication (URLLC) use cases. Improved accuracy, reduced latency for VPS queries, and better integration with 3GPP's overall architecture for integrated access and backhaul (IAB).
Expanded VPS capabilities for sidelink positioning and vehicle-to-everything (V2X) scenarios. Support for collaborative positioning where devices share visual data, and enhancements for power efficiency and scalability.
Continued evolution for advanced AR/VR and metaverse applications. Focus on improving visual map management, privacy-preserving techniques for visual data, and support for edge computing deployments of VPS servers.
Ongoing work to refine VPS performance, reduce signaling overhead, and explore AI/ML-based visual feature extraction and matching. Further integration with network sensing capabilities and non-terrestrial networks (NTN).
Defining Specifications
| Specification | Title |
|---|---|
| TS 23.039 | 3GPP TS 23.039 |
| TS 24.501 | 3GPP TS 24.501 |
| TS 26.223 | 3GPP TS 26.223 |
| TS 26.234 | 3GPP TS 26.234 |
| TS 26.522 | 3GPP TS 26.522 |
| TS 26.906 | 3GPP TS 26.906 |
| TS 26.928 | 3GPP TS 26.928 |
| TS 26.948 | 3GPP TS 26.948 |
| TS 29.525 | 3GPP TS 29.525 |