Data-driven systems depend on task-relevant data, yet data collection pipelines remain passive and indiscriminate. Continuous logging of multimodal sensor streams incurs high storage costs and captures irrelevant data. This paper proposes a declarative framework for intent-driven, on-device data collection that enables selective collection of multimodal sensor data based on high-level user requests. The framework combines natural language interaction with a formally specified domain-specific language (DSL). Large language models translate user-defined requirements into verifiable and composable DSL programs that define conditional triggers across heterogeneous sensors, including cameras, LiDAR, and system telemetry. Empirical evaluation on vehicular and robotic perception tasks shows that the DSL-based approach achieves higher generation consistency and lower execution latency than unconstrained code generation while maintaining comparable detection performance. The structured abstraction supports modular trigger composition and concurrent deployment on resource-constrained edge platforms. This approach replaces passive logging with a verifiable, intent-driven mechanism for multimodal data collection in real-time systems.
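To make the idea of verifiable, composable triggers concrete, the following is a minimal illustrative sketch and not the framework's actual DSL. All names here are assumptions introduced for illustration: the field names (`camera.num_pedestrians`, `lidar.min_distance_m`, `system.cpu_temp_c`), the `Cond`, `All`, and `AnyOf` node types, and the `validate` and `evaluate` helpers. It shows how a small closed vocabulary lets a generated program be checked before it runs on the device.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# Hypothetical sensor fields; the real DSL vocabulary is not specified in the abstract.
ALLOWED_FIELDS = {
    "camera.num_pedestrians",
    "lidar.min_distance_m",
    "system.cpu_temp_c",
}

# Closed set of comparison operators a trigger may use.
OPS = {
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "==": lambda a, b: a == b,
}


@dataclass(frozen=True)
class Cond:
    """Atomic condition on a single sensor field."""
    field: str
    op: str
    value: float


@dataclass(frozen=True)
class All:
    """Fires only when every sub-trigger fires."""
    parts: Tuple["Trigger", ...]


@dataclass(frozen=True)
class AnyOf:
    """Fires when at least one sub-trigger fires."""
    parts: Tuple["Trigger", ...]


Trigger = Union[Cond, All, AnyOf]


def validate(trigger: Trigger) -> None:
    """Reject programs that use unknown fields, operators, or node types."""
    if isinstance(trigger, Cond):
        if trigger.field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {trigger.field}")
        if trigger.op not in OPS:
            raise ValueError(f"unknown operator: {trigger.op}")
    elif isinstance(trigger, (All, AnyOf)):
        if not trigger.parts:
            raise ValueError("composite trigger has no parts")
        for part in trigger.parts:
            validate(part)
    else:
        raise TypeError(f"not a trigger node: {trigger!r}")


def evaluate(trigger: Trigger, frame: Dict[str, float]) -> bool:
    """Evaluate a validated trigger against one snapshot of sensor readings."""
    if isinstance(trigger, Cond):
        return OPS[trigger.op](frame[trigger.field], trigger.value)
    if isinstance(trigger, All):
        return all(evaluate(part, frame) for part in trigger.parts)
    return any(evaluate(part, frame) for part in trigger.parts)


# Example: record only when a pedestrian is visible AND an obstacle is within 5 m.
rule = All((
    Cond("camera.num_pedestrians", ">", 0),
    Cond("lidar.min_distance_m", "<", 5.0),
))
validate(rule)
should_record = evaluate(
    rule, {"camera.num_pedestrians": 2, "lidar.min_distance_m": 3.2}
)
```

Under these assumptions, a request such as "record when a pedestrian is close" maps to a small tree of conditions rather than arbitrary code, so a malformed or out-of-vocabulary program can be rejected by `validate` before deployment.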