Controllable learning (CL) is emerging as a critical component of trustworthy machine learning: it ensures that learners meet predefined targets and can adapt to changes in those targets without retraining. We provide a formal definition of CL and discuss its applications in information retrieval (IR), where information needs are often complex and dynamic. The survey categorizes CL according to who controls (users or platforms), what is controllable (e.g., retrieval objectives, users' historical behaviors, controllable environmental adaptation), how control is implemented (e.g., rule-based methods, Pareto optimization, hypernetworks), and where control is implemented (e.g., pre-processing, in-processing, and post-processing methods). We then identify challenges faced by CL in training, evaluation, task setting, and deployment in online environments. Finally, we outline promising directions for CL in theoretical analysis, efficient computation, empowering large language models, application scenarios, and evaluation frameworks in IR.