VGGlass - Demonstrating Visual Grounding and Localization Synergy with a LiDAR-enabled Smart-Glass

Published in the 21st ACM Conference on Embedded Networked Sensor Systems (SenSys 2023), 2023

Recommended citation: Rathnayake, D., Weerakoon, D., Radhakrishnan, M., Subbaraju, V., Hwang, I. and Misra, A., 2023, November. VGGlass - Demonstrating Visual Grounding and Localization Synergy with a LiDAR-enabled Smart-Glass. In 21st ACM Conference on Embedded Networked Sensor Systems (SenSys 2023) [In Press]

Demo Abstract: This work demonstrates the VGGlass system, which simultaneously interprets human instructions for a target acquisition task and determines the precise 3D positions of both the user and the target object. This is achieved by utilizing LiDARs mounted in the infrastructure and a smart glass device worn by the user. Key to our system is the union of LiDAR-based localization, termed LiLOC, and a multi-modal visual grounding approach, termed RealG(2)In-Lite. To demonstrate the system, we use Intel RealSense L515 cameras and a Microsoft HoloLens 2 as the user devices. VGGlass is able to: a) track the user in real-time in a global coordinate system, and b) locate target objects referred to by natural language and pointing gestures.
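To illustrate the pointing-gesture side of grounding, the sketch below shows one simple way such a step could work: given a pointing ray (origin and direction in the global frame) and candidate object centers, pick the candidate closest to the ray. This is a hypothetical minimal example, not the actual RealG(2)In-Lite method; all function names and values are illustrative.

```python
import math

def ray_point_distance(origin, direction, point):
    # Perpendicular distance from `point` to the ray starting at `origin`
    # along `direction`, clamping points behind the origin to the origin.
    norm = math.sqrt(sum(c * c for c in direction))
    u = tuple(c / norm for c in direction)           # unit direction
    v = tuple(p - o for p, o in zip(point, origin))  # origin -> point
    t = max(sum(vi * ui for vi, ui in zip(v, u)), 0.0)
    closest = tuple(o + t * ui for o, ui in zip(origin, u))
    return math.dist(point, closest)

def select_target(origin, direction, candidates):
    # Return the candidate object center nearest to the pointing ray.
    return min(candidates, key=lambda c: ray_point_distance(origin, direction, c))

# Hypothetical example: user at the origin pointing along +x;
# the object near the pointing ray is selected.
target = select_target((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                       [(2.0, 0.1, 0.0), (2.0, 1.0, 0.0)])
```

In practice a system like VGGlass would fuse this geometric cue with the natural-language referring expression; the ray-distance score is just one of the signals such fusion could use.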