Talks and presentations

Conference talk on COSM2IC: Optimizing Real-Time Multi-Modal Instruction Comprehension

October 26, 2022

Conference proceedings talk, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems, Kyoto, Japan

Supporting real-time, on-device execution of multi-modal referring instruction comprehension models is an important challenge in embodied Human-Robot Interaction. However, state-of-the-art deep learning models are resource-intensive and unsuitable for real-time execution on embedded devices. While model compression can reduce computational demands up to a point, further optimization causes a severe drop in accuracy (up to 50%). To minimize this loss in accuracy, we propose the COSM2IC framework, with a lightweight Task Complexity Predictor that uses multiple sensor inputs to assess instruction complexity and thereby dynamically switch between a set of models of varying computational intensity, so that computationally less demanding models are invoked whenever possible. To demonstrate the benefits of COSM2IC, we use a representative human-robot collaborative “table-top target acquisition” task to curate a new multi-modal instruction dataset in which a human issues instructions in a natural manner using a combination of visual, verbal, and gestural (pointing) cues. We show that COSM2IC achieves a 3-fold reduction in comprehension latency compared to a baseline DNN model while suffering an accuracy loss of only ∼5%. Compared to state-of-the-art model compression methods, COSM2IC achieves a further 30% reduction in latency and energy consumption for comparable performance.
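
As a rough illustration of the switching idea, the sketch below gates between a lighter and a heavier comprehension model based on a predicted complexity score. All names, the toy complexity heuristic, and the thresholds are hypothetical stand-ins; the actual COSM2IC Task Complexity Predictor is a trained, sensor-driven component.

```python
# Minimal sketch of complexity-gated model switching (hypothetical stand-in,
# not the actual COSM2IC implementation).
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ModelVariant:
    """A comprehension model with an (illustrative) relative compute cost."""
    name: str
    cost: float
    run: Callable[[Dict], str]


def predict_complexity(inputs: Dict) -> float:
    """Toy complexity score in [0, 1]; the real predictor is learned from sensor inputs."""
    score = 0.0
    if inputs.get("has_pointing_gesture"):
        score += 0.2  # an extra modality to reconcile with the verbal instruction
    score += min(len(inputs.get("instruction", "").split()) / 20.0, 0.5)
    score += min(0.3 * inputs.get("num_candidate_objects", 0) / 10.0, 0.3)
    return min(score, 1.0)


def comprehend(inputs: Dict, variants: List[ModelVariant], thresholds: List[float]) -> str:
    """Invoke the cheapest model whose complexity threshold covers the input."""
    c = predict_complexity(inputs)
    for variant, threshold in zip(variants, thresholds):
        if c <= threshold:
            return variant.run(inputs)
    return variants[-1].run(inputs)  # fall back to the most capable model


if __name__ == "__main__":
    variants = [  # ordered from lightest to heaviest
        ModelVariant("tiny", cost=1.0, run=lambda x: "red mug"),
        ModelVariant("full", cost=10.0, run=lambda x: "red mug"),
    ]
    query = {"instruction": "pick up the red mug next to the laptop",
             "has_pointing_gesture": True, "num_candidate_objects": 4}
    print(comprehend(query, variants, thresholds=[0.4, 1.0]))
```

The design point this sketch captures is that the gating decision itself must be far cheaper than the heaviest model, so that easy instructions never pay the full inference cost.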

Conference talk on Resilient Collaborative Intelligence for Adversarial IoT Environments

July 03, 2019

Conference proceedings talk, 2019 22nd International Conference on Information Fusion (FUSION), Remote

Many IoT networks, including battlefield deployments, involve resource-constrained sensors with varying degrees of redundancy/overlap (i.e., their data streams possess significant spatiotemporal correlation). Collaborative intelligence, whereby individual nodes adjust their inferencing pipelines to incorporate such correlated observations from other nodes, can improve both inferencing accuracy and performance metrics (such as latency and energy overheads). Using real-world data from a multi-camera deployment, we first demonstrate the significant performance gains (up to a 14% increase in accuracy) from such collaborative intelligence, achieved through two different approaches: (a) statistical fusion of outputs from different nodes, and (b) the development of new collaborative deep neural networks (DNNs). We then show that these collaboration-driven performance gains are susceptible to adversarial behaviour by one or more nodes and thus need resilient mechanisms to provide robustness against such malicious behaviour. We also introduce an under-development testbed at Singapore Management University (SMU), specifically designed to enable real-world experimentation with such collaborative IoT intelligence techniques.
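
As a rough illustration of approach (a), the sketch below performs late fusion of per-node classification outputs by weighted averaging. The function, weights, and example values are hypothetical and do not reproduce the specific fusion method evaluated in the talk.

```python
# Minimal sketch of late (statistical) fusion across overlapping sensor nodes
# (illustrative only).
from typing import List, Optional

import numpy as np


def fuse_predictions(node_probs: List[np.ndarray],
                     weights: Optional[List[float]] = None) -> int:
    """Weighted average of per-class probability vectors reported by overlapping nodes."""
    probs = np.stack(node_probs)               # shape: (num_nodes, num_classes)
    if weights is None:
        weights = [1.0] * len(node_probs)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                            # normalise node weights
    fused = (w[:, None] * probs).sum(axis=0)   # fused class distribution
    return int(np.argmax(fused))


if __name__ == "__main__":
    cam_a = np.array([0.7, 0.2, 0.1])          # node A: clear view, confident in class 0
    cam_b = np.array([0.3, 0.6, 0.1])          # node B: partially occluded view
    print(fuse_predictions([cam_a, cam_b], weights=[0.6, 0.4]))
```

A simple way to harden such fusion against a single misbehaving node would be, for example, to replace the weighted mean with a per-class median or to down-weight nodes whose outputs are persistent outliers; the resilience mechanisms discussed in the talk are more involved than this sketch.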