We propose an algorithm for next-query recommendation in interactive data exploration settings, such as knowledge discovery for information gathering. State-of-the-art query recommendation algorithms are based on sequence-to-sequence learning approaches that exploit historical interaction data. Because the learning process is supervised, such approaches fail to adapt to immediate user feedback. We propose to augment transformer-based causal language models for query recommendation so that they adapt to immediate user feedback, using a multi-armed bandit (MAB) framework. We conduct a large-scale experimental study using log files from a popular online literature discovery service and demonstrate that our algorithm substantially reduces per-round regret compared with state-of-the-art transformer-based query recommendation models, which do not make use of immediate user feedback. Our data model and source code are available at https://github.com/shampp/exp3_ss
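To make the feedback loop concrete, the following is a minimal sketch of how a bandit could re-weight candidate queries from round-by-round feedback. The abstract does not state which bandit algorithm is used, so Exp3, a standard adversarial bandit algorithm, is assumed here purely for illustration. The candidate queries, the click probabilities, the `Exp3` class, and the `simulate` function are all hypothetical; in a real system the candidates would presumably come from the language model and the reward from the user's actual response.

```python
import math
import random


class Exp3:
    """Exp3 bandit over a fixed set of candidate queries (arms). Illustrative only."""

    def __init__(self, n_arms, gamma=0.1, seed=0):
        self.n_arms = n_arms
        self.gamma = gamma  # exploration rate in (0, 1]
        self.weights = [1.0] * n_arms
        self.rng = random.Random(seed)

    def probabilities(self):
        # Mix the normalized weights with a uniform exploration term.
        total = sum(self.weights)
        return [
            (1.0 - self.gamma) * w / total + self.gamma / self.n_arms
            for w in self.weights
        ]

    def select(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs, k=1)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Importance-weighted estimate of the reward, with reward in [0, 1].
        estimate = reward / prob
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescale to avoid overflow; the probabilities are unchanged by this.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(rounds=2000, seed=0):
    # Hypothetical candidate next queries and hypothetical acceptance rates.
    candidates = ["graph embedding", "graph neural networks survey", "graph kernels"]
    click_prob = [0.2, 0.8, 0.4]

    bandit = Exp3(len(candidates), seed=seed)
    env = random.Random(seed + 1)  # simulated user, separate from the bandit's RNG
    best = max(click_prob)
    cumulative_regret = 0.0

    for _ in range(rounds):
        arm, prob = bandit.select()
        reward = 1.0 if env.random() < click_prob[arm] else 0.0  # immediate feedback
        bandit.update(arm, prob, reward)
        cumulative_regret += best - click_prob[arm]  # expected regret of this choice

    return cumulative_regret / rounds, bandit.probabilities()


if __name__ == "__main__":
    per_round_regret, probs = simulate()
    print(f"per-round regret: {per_round_regret:.3f}")
    print("final selection probabilities:", [round(p, 3) for p in probs])
```

Per-round regret here is the average gap between the best candidate's expected reward and that of the candidate actually recommended, which is one common way to define the metric; the definition used in the paper's experiments may differ.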