Information access systems, such as search engines, recommender systems, and conversational assistants, have become integral to our daily lives because they help us satisfy our information needs. However, evaluating the effectiveness of these systems is a long-standing and complex scientific challenge. The challenge is rooted in the difficulty of assessing a system's overall effectiveness in helping users complete tasks through interactive support, and it is further exacerbated by the substantial variation in user behaviour and preferences. User simulation has emerged as a promising way to address this challenge. This book aims to provide a thorough understanding of user simulation techniques designed specifically for evaluation purposes. We begin with background on the evaluation of information access systems and explore the diverse applications of user simulation. We then systematically review the major research progress in user simulation, covering general frameworks for designing user simulators, the use of user simulation for evaluation, and specific models and algorithms for simulating user interactions with search engines, recommender systems, and conversational assistants. Because user simulation is an interdisciplinary research topic, we attempt wherever possible to establish connections with related fields, including machine learning, dialogue systems, user modeling, and economics. We end the book with a detailed discussion of important future research directions, many of which extend beyond the evaluation of information access systems and are expected to have a broader impact on how interactive intelligent systems are evaluated in general.