Demonstrating Multi-modal Human Instruction Comprehension with AR Smart Glass
Published in 2023 15th International Conference on COMmunication Systems & NETworkS (COMSNETS), 2023
Recommended citation: Weerakoon, D., Subbaraju, V., Tran, T. and Misra, A., 2023, January. Demonstrating Multi-modal Human Instruction Comprehension with AR Smart Glass. In 2023 15th International Conference on COMmunication Systems & NETworkS (COMSNETS) (pp. 231-233). IEEE.
We present a multi-modal human instruction comprehension prototype for object acquisition tasks that involve verbal, visual, and pointing-gesture cues. Our prototype includes an AR smart glass for issuing the instructions and a Jetson TX2 pervasive device for executing the comprehension algorithms. With this setup, we enable on-device, computationally efficient comprehension of object acquisition tasks, with an average latency in the range of 150-330 ms.