Designing Multi-Robot Ground Video Sensemaking with Public Safety Professionals
Abstract
Videos from fleets of ground robots can advance public safety by providing scalable situational awareness and reducing professionals' burden. Yet little is known about how to design and integrate multi-robot videos into public safety workflows. Collaborating with six police agencies, we examined how such videos could be made practical.
In Study 1, we presented the first testbed for multi-robot ground video sensemaking. The testbed includes 38 events-of-interest (EoI) relevant to public safety, a dataset of 20 robot patrol videos (10 day/night pairs) covering EoI types, and 6 design requirements aimed at improving current video sensemaking practices.
In Study 2, we built MRVS, a tool that augments multi-robot patrol video streams with a prompt-engineered video understanding model. Participants reported reduced manual workload and greater confidence with LLM-based explanations, while noting concerns about false alarms and privacy. We conclude with implications for designing future multi-robot video sensemaking tools.
Dataset Preview
Examples of different anomalies in our video dataset shown in sequences. Each second column is manually zoomed in.
Video Presentation
Short overview of the MRVS system.
BibTeX
@inproceedings{zhou2026designing,
author = {Zhou, Puqi and Asgarov, Ali and Hussain, Aafiya and Park, Wonjoon and Paudyal, Amit and Shrestha, Sameep and Tang, Chia-Wei and Lighthiser, Michael and Hieb, Michael and Xiao, Xuesu and Thomas, Chris and Hong, Sungsoo Ray},
title = {Designing Multi-Robot Ground Video Sensemaking with Public Safety Professionals},
booktitle = {Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems},
year = {2026},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
note = {To appear},
}