Autocomplete suggestions are fundamental to modern text entry systems, with applications in domains such as messaging and email composition. Typically, suggestions are generated by a language model and surfaced only when the model's confidence exceeds a threshold. However, this threshold does not directly account for the cognitive load that surfacing a suggestion imposes on the user, such as the effort of switching context from typing to reading the suggestion and the time needed to decide whether to accept it. In this paper, we study the problem of improving inline autocomplete suggestions in text entry systems via a sequential decision-making formulation, and use reinforcement learning to learn suggestion policies through repeated interactions with a target user over time. This formulation allows us to factor cognitive load into the training objective of an autocomplete model through a reward function based on text entry speed. We present theoretical and experimental evidence that, under certain objectives, the sequential decision-making formulation of the autocomplete problem yields a better suggestion policy than myopic single-step reasoning. However, aligning these objectives with real users requires further exploration. In particular, we hypothesize that the objectives under which sequential decision-making can improve autocomplete systems are not tailored solely to text entry speed, but relate more broadly to metrics such as user satisfaction and convenience.
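To make the gap between a confidence threshold and a speed-based objective concrete, the following is a minimal sketch and not the method studied in the paper. It compares a plain threshold rule with a rule that shows a suggestion only when the expected time saved is positive after charging a fixed reading cost. All names (`Suggestion`, `threshold_policy`, `expected_time_saved`, `cost_aware_policy`) and all timing constants are hypothetical, chosen only for illustration. The rule is still single-step: it illustrates how cognitive cost can enter a reward based on text entry speed, not the sequential policy learned with reinforcement learning.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    text: str        # characters the suggestion would fill in
    p_accept: float  # estimated probability that the user accepts it


def threshold_policy(s: Suggestion, tau: float = 0.5) -> bool:
    """Show the suggestion whenever confidence reaches the threshold."""
    return s.p_accept >= tau


def expected_time_saved(
    s: Suggestion,
    sec_per_char: float = 0.2,  # assumed typing time per character
    read_cost: float = 0.6,     # assumed cost of reading any shown suggestion
    accept_cost: float = 0.3,   # assumed cost of the accept keystroke
) -> float:
    """Expected seconds saved by showing the suggestion (negative = net loss)."""
    typing_time = len(s.text) * sec_per_char
    gain_if_accepted = typing_time - accept_cost
    # The reading cost is paid whether or not the user accepts.
    return s.p_accept * gain_if_accepted - read_cost


def cost_aware_policy(s: Suggestion) -> bool:
    """Show the suggestion only if it is expected to save time overall."""
    return expected_time_saved(s) > 0.0


if __name__ == "__main__":
    short = Suggestion("ok", p_accept=0.9)
    long = Suggestion("see you tomorrow morning", p_accept=0.4)
    for s in (short, long):
        print(repr(s.text), threshold_policy(s), cost_aware_policy(s))
```

Under these assumed constants the two rules disagree in both directions: the short, high-confidence completion passes the threshold but costs more to read than it saves, while the long, lower-confidence completion fails the threshold yet is worth showing in expectation.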