Open Source Visual Positioning Service

by George Mason University

Led by Bo Han, the team at George Mason University (GMU) is developing an open-source visual positioning service (VPS). A VPS localizes a device against a spatial map. The map consists of sparse 3D points (i.e., a point cloud) with visual feature descriptors, and it stores explicit geometric information and semantic knowledge of the surrounding environment in volumetric views. Widely used methods for spatial mapping include structure from motion (SfM) and visual SLAM (simultaneous localization and mapping), both of which process images that capture the physical world. During localization, a mobile device captures camera images and uploads them to a server, which matches them against the 3D features in the spatial map to estimate the device’s 6DoF pose (position and orientation). This method is referred to as image-based localization. To protect user privacy, the device can instead extract 2D features from the camera image and upload those, rather than sending the image itself.
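To make the 6DoF pose concrete, the following is a minimal sketch of the pinhole projection that image-based localization inverts: given a pose, a 3D map point lands at a predictable pixel, and the server searches for the pose that best explains the observed 2D-3D matches. The function names, the single focal length, and the world-to-camera convention (x_cam = R * x_world + t) are assumptions made for illustration, not taken from the GMU code.

```python
import math

def quat_to_matrix(qw, qx, qy, qz):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n  # normalize defensively
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy)],
    ]

def project(point_world, quat_wxyz, translation, focal, cx, cy):
    """Project a 3D map point to pixel coordinates.

    Assumed convention: x_cam = R * x_world + t (world-to-camera).
    Returns None when the point lies behind the camera.
    """
    rot = quat_to_matrix(*quat_wxyz)
    x_cam = [
        sum(rot[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if x_cam[2] <= 0:
        return None
    u = focal * x_cam[0] / x_cam[2] + cx
    v = focal * x_cam[1] / x_cam[2] + cy
    return (u, v)

def reprojection_error(observed_uv, point_world, quat_wxyz, translation, focal, cx, cy):
    """Pixel distance between an observed 2D feature and the projected 3D point."""
    projected = project(point_world, quat_wxyz, translation, focal, cx, cy)
    if projected is None:
        return math.inf
    return math.hypot(observed_uv[0] - projected[0], observed_uv[1] - projected[1])
```

A pose estimator such as PnP with RANSAC would look for the rotation and translation that keep this reprojection error small across many matched features.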

The GMU team is building the backend service of a VPS by leveraging existing open-source packages for structure from motion and hierarchical visual localization, including Colmap (https://github.com/colmap/colmap) and Hierarchical Localization (HLOC) (https://github.com/cvg/Hierarchical-Localization).

So far, the team has built the back end of the mapping and localization service on top of Colmap.

 

How it works

 
 

The team created a Python wrapper for common functionality of the above tools and exposed it as a RESTful API on the Web (https://github.com/wunan96nj/3d-mapping-localization-GPS).

Through this REST API, clients can supply images for reconstruction and other images for localization. Its functions include uploading an image to the server under a user-specified workspace, building a 3D map with Colmap on the server, and uploading an image to be localized within a specified workspace. For reconstruction, the server also uses the GPS data provided with the uploaded images to scale the reconstructed model to its actual size and to register the map's origin to a GPS position. For localization, once the server has estimated the pose of the query image in map coordinates, it sends the pose and the GPS position of the map's origin back to the client.
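As an illustration of how a client might talk to such a service, here is a minimal sketch that builds a JSON request carrying an image, a workspace name, and an optional GPS fix. The endpoint names, the JSON field names, and the base URL are hypothetical; the actual routes and payload format are defined in the repository linked above and may differ.

```python
import base64
import json
import urllib.request

def build_request(base_url, endpoint, workspace, image_bytes, gps=None):
    """Build (but do not send) a POST request for a hypothetical VPS endpoint.

    `endpoint` might be something like "upload" or "localize"; these names
    and the payload fields below are assumptions made for illustration.
    """
    payload = {
        "workspace": workspace,
        # Binary image data is base64-encoded so it can travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),
    }
    if gps is not None:
        lat, lon, alt = gps
        payload["gps"] = {"lat": lat, "lon": lon, "alt": alt}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request, timeout=30):
    """Send the request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect before any network traffic happens, for example `build_request("http://localhost:5000", "localize", "campus", image_bytes)` followed by `send(...)` once a server is running.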

The team has enhanced the RESTful API implementation to follow the GeoPose protocol defined by OARC. An intermediate API runs on the same server: it processes incoming GeoPose requests, calculates the GPS position and the quaternion in ENU (east-north-up) coordinates from the pose returned by the localization service, and then sends the GeoPose response to the client following the GeoPose protocol.
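The conversion performed by the intermediate API can be sketched roughly as follows, assuming the localized position is already expressed as an east-north-up offset in meters from the map's GPS origin. The sketch uses a small-offset approximation on the WGS84 ellipsoid, which is reasonable for maps spanning at most a few kilometers; it is not necessarily the method the team uses. The dictionary layout follows the GeoPose Basic-Quaternion style (position with lat, lon, h plus a quaternion) as I understand it and should be checked against the specification. The function names are hypothetical.

```python
import math

WGS84_A = 6378137.0            # semi-major axis in meters
WGS84_E2 = 6.69437999014e-3    # first eccentricity squared

def enu_to_geodetic(east, north, up, lat0_deg, lon0_deg, h0):
    """Convert a small ENU offset (meters) from an origin into lat/lon/height.

    Local-tangent-plane approximation: accuracy degrades as the offset grows,
    and the formula is undefined at the poles, where cos(latitude) is zero.
    """
    lat0 = math.radians(lat0_deg)
    w = math.sqrt(1.0 - WGS84_E2 * math.sin(lat0) ** 2)
    meridian_radius = WGS84_A * (1.0 - WGS84_E2) / w ** 3   # north-south curvature
    prime_vertical_radius = WGS84_A / w                      # east-west curvature
    dlat = north / (meridian_radius + h0)
    dlon = east / ((prime_vertical_radius + h0) * math.cos(lat0))
    return (
        lat0_deg + math.degrees(dlat),
        lon0_deg + math.degrees(dlon),
        h0 + up,
    )

def make_geopose(enu_position, quat_xyzw, origin_gps):
    """Assemble a GeoPose-style dictionary.

    `quat_xyzw` is assumed to be the device orientation already expressed
    relative to the local ENU frame; it is passed through unchanged.
    """
    east, north, up = enu_position
    lat, lon, h = enu_to_geodetic(east, north, up, *origin_gps)
    qx, qy, qz, qw = quat_xyzw
    return {
        "position": {"lat": lat, "lon": lon, "h": h},
        "quaternion": {"x": qx, "y": qy, "z": qz, "w": qw},
    }
```

For example, `make_geopose((2.0, 5.0, 0.3), (0.0, 0.0, 0.0, 1.0), (38.83, -77.31, 130.0))` would describe a device about 5 m north and 2 m east of the map origin.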

The next step is to speed up localization with hierarchical localization (https://github.com/cvg/Hierarchical-Localization, https://github.com/naver/kapture-localization).
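The idea behind hierarchical localization can be sketched in a few lines: first retrieve the few mapped images that look most similar to the query using a compact global descriptor, then match local features only against the 3D points seen in those images, instead of against the whole map. The toy code below uses plain lists and cosine similarity; real pipelines use learned descriptors and efficient indexing, and the names and data layout here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_top_k(query_global, db_globals, k):
    """Coarse step: rank mapped images by global-descriptor similarity.

    `db_globals` maps an image name to its global descriptor.
    """
    ranked = sorted(
        db_globals,
        key=lambda name: cosine(query_global, db_globals[name]),
        reverse=True,
    )
    return ranked[:k]

def candidate_points(top_images, visibility):
    """Fine step input: the union of 3D point ids seen in the retrieved images.

    `visibility` maps an image name to the set of 3D point ids it observes.
    """
    point_ids = set()
    for name in top_images:
        point_ids |= visibility.get(name, set())
    return point_ids
```

Local feature matching and pose estimation then run only on the returned candidate points, which is where the speed-up comes from when the map is large.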