Information access systems, such as search engines, recommender systems, and conversational assistants, have become integral to our daily lives as they help us satisfy our information needs. However, evaluating the effectiveness of these systems presents a long-standing and complex scientific challenge. This challenge is rooted in the difficulty of assessing a system's overall effectiveness in assisting users to complete tasks through interactive support, and is further exacerbated by the substantial variation in user behaviour and preferences. User simulation has emerged as a promising solution to this challenge. This book aims to provide a thorough understanding of user simulation techniques designed specifically for evaluation purposes. We begin with background on information access system evaluation and explore the diverse applications of user simulation. We then systematically review the major research progress in user simulation, covering general frameworks for designing user simulators, the use of user simulation for evaluation, and specific models and algorithms for simulating user interactions with search engines, recommender systems, and conversational assistants. Because user simulation is an interdisciplinary research topic, we attempt, wherever possible, to establish connections with related fields, including machine learning, dialogue systems, user modeling, and economics. We end the book with a detailed discussion of important future research directions, many of which extend beyond the evaluation of information access systems and are expected to have a broader impact on how interactive intelligent systems are evaluated in general.