Autocomplete suggestions are fundamental to modern text entry systems, with applications in domains such as messaging and email composition. Typically, autocomplete suggestions are generated by a language model and surfaced whenever their confidence exceeds a threshold. However, this threshold does not directly account for the cognitive load imposed on the user by surfacing suggestions, such as the effort to switch contexts from typing to reading the suggestion, and the time to decide whether to accept it. In this paper, we study the problem of improving inline autocomplete suggestions in text entry systems via a sequential decision-making formulation, and use reinforcement learning to learn suggestion policies through repeated interactions with a target user over time. This formulation allows us to factor cognitive load into the objective of training an autocomplete model, through a reward function based on text entry speed. We present theoretical and experimental evidence that, under certain objectives, the sequential decision-making formulation of the autocomplete problem provides a better suggestion policy than myopic single-step reasoning. However, aligning these objectives with real users requires further exploration. In particular, we hypothesize that the objectives under which sequential decision-making can improve autocomplete systems are not tailored solely to text entry speed, but more broadly to metrics such as user satisfaction and convenience.
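To make the speed-versus-cognitive-load trade-off concrete, below is a minimal sketch of the myopic single-step baseline that the abstract contrasts with sequential decision-making. All names and time constants here are illustrative assumptions, not values from the paper: the policy shows a suggestion only when the expected typing time saved outweighs the fixed cost of reading it.

```python
# Hypothetical per-action time costs in seconds (assumptions for illustration).
TYPE_CHAR_S = 0.3   # typing one character manually
READ_COST_S = 0.5   # context switch: reading a surfaced suggestion
ACCEPT_S = 0.2      # accepting a suggestion (e.g., one tap)


def should_show(remaining_chars: int, p_accept: float) -> bool:
    """Myopic one-step policy: surface the suggestion iff doing so
    lowers the expected time to finish the current word.

    remaining_chars: characters the suggestion would complete.
    p_accept: model's estimate that the user accepts the suggestion.
    """
    time_if_typed = remaining_chars * TYPE_CHAR_S
    # Showing always pays the reading cost; on rejection the user
    # falls back to typing the remaining characters anyway.
    time_if_shown = (
        READ_COST_S
        + p_accept * ACCEPT_S
        + (1 - p_accept) * time_if_typed
    )
    return time_if_shown < time_if_typed
```

Under these constants, a long completion with moderate confidence is worth showing (`should_show(10, 0.5)` is true), while a short one is not (`should_show(2, 0.5)` is false). A sequential formulation would instead optimize the cumulative negative-time reward over the whole typing session, which is what lets it outperform this one-step rule under the objectives studied in the paper.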