Modern data exploration tools often struggle to capture the subtleties of analytical intent, especially when users seek patterns that are difficult to specify using traditional query methods or natural language alone. We introduce a multimodal research probe for querying time-series and geospatial data that integrates free-form sketching, natural language, and visual annotations within a unified interaction space. Users articulate queries by sketching trends or spatial paths and augmenting them with annotations and analytical directives grounded in shared spatial and temporal context. The system employs a hybrid architecture combining geometric sketch matching and visual language models (VLMs) to support queries that interleave pattern matching and semantic constraints. Through a preliminary study with 20 participants, we observed recurring interaction patterns in which participants used spatial, temporal, and visual proximity to relate sketches, annotations, and language. Rather than treating these as isolated inputs, participants relied on their relative placement to disambiguate meaning. We analyze these behaviors as evidence for proximity semantics (PS), a form of deictic disambiguation in which meaning is shaped by the closeness of multimodal elements within a shared interaction space. We present PS as a conceptual lens grounded in observed user behavior, and discuss its implications for the design of future multimodal data exploration systems.