Cognitive assistants (CAs) are chatbots that provide context-aware support to human workers in knowledge-intensive tasks. Traditionally, CAs respond in fixed ways to predefined user intents and conversation patterns, and this rigidity copes poorly with the diversity of natural language. Recent advances in natural language processing (NLP), which power large language models (LLMs) such as GPT-4, Llama2, and Gemini, could enable CAs to converse in a more flexible, human-like manner. However, the additional degrees of freedom may have unforeseen consequences, especially in knowledge-intensive contexts where accuracy is crucial. As a preliminary step toward assessing the potential of LLMs in these contexts, we conducted a user study comparing an LLM-based CA with an intent-based system in terms of interaction efficiency, user experience, workload, and usability. The study found that the LLM-based CA achieved better user experience, task completion rate, usability, and perceived performance than the intent-based system, suggesting that switching NLP techniques should be investigated further.