Proactive robot assistance in household environments requires accurate prediction of human activities and object usage under dynamic and noisy conditions. Existing approaches often rely on complex spatio-temporal models, which can be computationally expensive and sensitive to environmental variability. In this paper, we propose GLOBE, a lightweight framework that combines n-gram Markov models, which capture temporal behavioral patterns, with uncertainty-guided large language model (LLM) reasoning. The framework performs sequential prediction efficiently and invokes LLM reasoning only when model confidence is low. To evaluate performance under realistic conditions, we introduce HOMER-Noise, a noisy extension of the HOMER+ dataset that simulates structured disturbances such as object movements caused by humans, pets, and toddlers. Experimental results show that GLOBE achieves performance competitive with state-of-the-art methods while improving robustness and computational efficiency in both clean and noisy settings. The framework is further validated through a proof-of-concept integration with a Stretch 3 mobile manipulator, demonstrating its potential for real-world human-robot interaction scenarios.
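To make the idea of uncertainty-guided fallback concrete, the following is a minimal Python sketch, not the GLOBE implementation. It assumes activities are discrete string labels, uses normalized Shannon entropy of the n-gram distribution as the confidence signal, and treats the LLM as an opaque callable. The names `NGramPredictor`, `normalized_entropy`, `predict_next`, `llm_fallback`, and the threshold `max_entropy` are all hypothetical; the abstract does not specify the uncertainty measure or the LLM interface.

```python
from collections import Counter, defaultdict
from math import log
from typing import Callable, Dict, Sequence, Tuple


class NGramPredictor:
    """Order-n Markov model over discrete activity labels (illustrative only)."""

    def __init__(self, n: int = 2) -> None:
        self.n = n  # number of past activities used as context
        self.counts: Dict[Tuple[str, ...], Counter] = defaultdict(Counter)

    def fit(self, sequences: Sequence[Sequence[str]]) -> None:
        # Count how often each activity follows each length-n context.
        for seq in sequences:
            for i in range(len(seq) - self.n):
                context = tuple(seq[i:i + self.n])
                self.counts[context][seq[i + self.n]] += 1

    def distribution(self, history: Sequence[str]) -> Dict[str, float]:
        # Relative frequencies of next activities; empty if context is unseen.
        counter = self.counts.get(tuple(history[-self.n:]))
        if not counter:
            return {}
        total = sum(counter.values())
        return {label: c / total for label, c in counter.items()}


def normalized_entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy scaled to [0, 1]; 0 means a single certain outcome."""
    if len(dist) < 2:
        return 0.0
    h = -sum(p * log(p) for p in dist.values() if p > 0)
    return h / log(len(dist))


def predict_next(
    model: NGramPredictor,
    history: Sequence[str],
    llm_fallback: Callable[[Sequence[str], Dict[str, float]], str],
    max_entropy: float = 0.85,  # assumed threshold, not taken from the paper
) -> Tuple[str, str]:
    """Return (prediction, source), where source is 'ngram' or 'llm'."""
    dist = model.distribution(history)
    # Defer to the (hypothetical) LLM when the context is unseen or the
    # n-gram distribution is too flat to trust.
    if not dist or normalized_entropy(dist) > max_entropy:
        return llm_fallback(history, dist), "llm"
    return max(dist, key=dist.get), "ngram"


if __name__ == "__main__":
    days = [
        ["wake", "coffee", "breakfast", "work"],
        ["wake", "coffee", "breakfast", "work"],
        ["wake", "coffee", "breakfast", "dishes"],
        ["wake", "coffee", "shower", "work"],
    ]
    model = NGramPredictor(n=2)
    model.fit(days)
    # Stand-in for a real LLM call; it only marks that the fallback fired.
    stub = lambda history, dist: "ask-llm"
    print(predict_next(model, ["wake", "coffee"], stub))
    print(predict_next(model, ["coffee", "breakfast"], stub))
```

In this sketch the cheap n-gram lookup handles contexts with a clear majority continuation, and the expensive call is reserved for unseen or ambiguous contexts. Other confidence signals, such as a maximum-probability threshold, would fit the same structure.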