Conversational search systems can improve the user experience in digital libraries by offering a natural and intuitive way to interact with library content. However, most conversational search systems are limited to performing simple tasks and controlling smart devices. There is therefore a need for systems that can accurately understand the user's information requirements and perform the appropriate search activity. Prior research on intelligent systems suggests that the functional aspect of discourse (search intent) can be understood by identifying the speech acts in user dialogues. In this work, we automatically identify the speech acts associated with spoken utterances and use them to predict system-level search actions. First, we conducted a Wizard-of-Oz study to collect data from 75 search sessions. We then performed thematic analysis to curate a gold standard dataset of human-system interactions, containing 1,834 utterances and 509 system actions, in three information-seeking scenarios. Next, we developed attention-based deep neural networks to understand natural language and predict speech acts. The speech acts were then fed to the model to predict the corresponding system-level search actions. We also annotated a second dataset to validate our results. On the two datasets, the best-performing classification model achieved maximum accuracies of 90.2% and 72.7%, respectively, for speech act classification and 58.8% and 61.1%, respectively, for search action classification.
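The two-stage flow described in the abstract (utterances to speech acts, then speech acts to a system-level search action) can be sketched roughly as follows. This is only an illustration of the data flow, not the authors' implementation: the keyword rules stand in for the attention-based networks, and every label name (`question`, `request`, `statement`, `run_query`, `clarify`, `no_action`) and function name is a hypothetical placeholder rather than part of the paper's annotation scheme.

```python
from typing import Callable, Sequence

# Type aliases for the two stages. Both are assumptions made for this sketch.
SpeechActModel = Callable[[str], str]
SearchActionModel = Callable[[Sequence[str]], str]


def toy_speech_act(utterance: str) -> str:
    """Stand-in for a speech act classifier (simple keyword rules, not a neural net)."""
    text = utterance.lower().strip()
    if text.endswith("?"):
        return "question"  # hypothetical label
    if text.startswith(("find", "show", "search")):
        return "request"  # hypothetical label
    return "statement"  # hypothetical label


def toy_search_action(speech_acts: Sequence[str]) -> str:
    """Stand-in for the second stage: maps a turn's speech acts to one system action."""
    if "request" in speech_acts:
        return "run_query"  # hypothetical action
    if "question" in speech_acts:
        return "clarify"  # hypothetical action
    return "no_action"  # hypothetical action


def predict_search_action(
    utterances: Sequence[str],
    speech_act_model: SpeechActModel,
    action_model: SearchActionModel,
) -> str:
    """Stage 1 labels each utterance; stage 2 predicts the action from those labels."""
    speech_acts = [speech_act_model(u) for u in utterances]
    return action_model(speech_acts)


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty label set")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

Keeping the two stages as separate callables mirrors the abstract's description of speech act classification and search action classification as two tasks, each evaluated with its own accuracy figure.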