Video-Based Design (VBD) uses video as a primary medium for analyzing user interactions, prototyping, and generating design insights. However, current VBD workflows are constrained by labor-intensive, inconsistent manual annotation that fragments attention and delays insights. Computer Vision (CV)-powered automatic annotation offers an opportunity to reduce manual effort while supporting higher-level interpretation. This paper investigates human-AI collaboration in video analysis by examining how different levels of automated support shape user experience in VBD. We developed MarkupLens, a CV-assisted annotation platform, and conducted a between-subjects eye-tracking study with 36 designers using an urban VBD case. We compared three levels of automation (no support, partial support, and full support) and found that higher levels improved annotation quality, reduced cognitive load, and, interestingly, enriched reflection. Our insights on automation levels inform adjustable autonomy and mixed-initiative system design beyond VBD tasks.