Current large language model agents predominantly operate under a reactive paradigm, responding only to immediate user queries within short-term sessions. This limitation hinders their ability to track long-term user intents and to adapt dynamically to evolving external environments. In this paper, we propose a novel interaction paradigm for proactive task-oriented agents that bridges the gap between relatively static user needs and a dynamic environment. We formalize proactivity through two key capabilities: (i) Intent-Conditioned Monitoring, in which the agent autonomously formulates trigger conditions based on the dialog history; and (ii) Event-Triggered Follow-up, in which the agent actively engages the user upon detecting useful environmental updates. We introduce a high-quality data synthesis pipeline to construct complex, multi-turn dialog data in a dynamic environment. To address the lack of evaluation criteria for task-oriented interaction in a dynamic environment, we further propose a new benchmark, ChronosBench. We evaluate several leading closed-source and open-source models and reveal their shortcomings in long-term task-oriented interaction. Our model, fine-tuned with supervised learning on the synthetic data, achieves a task completion rate of 85.19% on complex tasks that include shifts in user intent, outperforming the other models tested. This result validates the effectiveness of our data-driven strategy.
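To make the two capabilities named in the abstract more concrete, the following is a minimal illustrative sketch and not the paper's implementation. It assumes a toy setting in which the user asks to be notified about a price threshold. The names `Trigger`, `formulate_trigger`, and `follow_up` are hypothetical, and a simple regular expression stands in for whatever model call would actually derive the trigger condition from the dialog.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical pattern: matches phrases such as "below 300" or "below $299.50".
_PRICE_RE = re.compile(r"below\s+\$?(\d+(?:\.\d+)?)")


@dataclass
class Trigger:
    """A monitoring condition derived from the dialog (illustrative structure)."""
    description: str
    condition: Callable[[dict], bool]


def formulate_trigger(dialog_history: list[str]) -> Optional[Trigger]:
    """Intent-conditioned monitoring: derive a trigger from the dialog history.

    Scans from the most recent utterance backwards, so a later change of
    intent overrides an earlier one. The regex is a stand-in assumption for
    an LLM-based step.
    """
    for utterance in reversed(dialog_history):
        match = _PRICE_RE.search(utterance.lower())
        if match:
            threshold = float(match.group(1))
            return Trigger(
                description=f"price below {threshold:g}",
                # Bind the threshold as a default argument to freeze its value.
                condition=lambda event, t=threshold: event.get("price", float("inf")) < t,
            )
    return None


def follow_up(trigger: Optional[Trigger], event: dict) -> Optional[str]:
    """Event-triggered follow-up: message the user only when the trigger fires."""
    if trigger is not None and trigger.condition(event):
        return f"Update: {trigger.description} is now met (price={event['price']})."
    return None


if __name__ == "__main__":
    history = [
        "I want a flight to Tokyo.",
        "Tell me if the price drops below 300.",
    ]
    trigger = formulate_trigger(history)
    for event in ({"price": 350}, {"price": 280}):
        message = follow_up(trigger, event)
        if message:
            print(message)
```

In this sketch the agent stays silent for environment updates that do not satisfy the stored condition and speaks only when one does, which is the behavioral difference from a purely reactive agent that the abstract describes.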