Integrated control of wheelchairs and wheelchair-mounted robotic arms (WMRAs) has strong potential to increase independence for users with severe motor limitations, yet existing interfaces often lack the flexibility needed for intuitive assistive interaction. Although data-driven AI methods show promise, progress is limited by the lack of multimodal datasets that capture natural Human-Robot Interaction (HRI), particularly conversational ambiguity in dialogue-driven control. To address this gap, we propose a multimodal data collection framework that employs a dialogue-based interaction protocol and a two-room Wizard-of-Oz (WoZ) setup to simulate robot autonomy while eliciting natural user behavior. Across five assistive tasks, the framework records five synchronized modalities: RGB-D video, conversational audio, inertial measurement unit (IMU) signals, end-effector Cartesian pose, and whole-body joint states. Using this framework, we collected a pilot dataset of 53 trials from five participants and validated its quality through motion smoothness analysis and user feedback. The results show that the framework effectively captures diverse ambiguity types and supports natural dialogue-driven interaction, demonstrating its suitability for scaling to a larger dataset for learning, benchmarking, and evaluation of ambiguity-aware assistive control.