Clarifying questions are an integral component of modern information retrieval systems, directly affecting user satisfaction and overall system performance. Poorly formulated questions can frustrate and confuse users, which in turn degrades system performance. This research addresses the need to identify and leverage the key features that contribute to the classification of clarifying questions, with the goal of improving user satisfaction. To understand how different features influence user satisfaction, we conduct a comprehensive analysis over a broad spectrum of lexical, semantic, and statistical features, such as question length and sentiment polarity. Our empirical results provide three main insights into the qualities of effective query clarification: (1) specific questions are more effective than generic ones; (2) the subjectivity and emotional tone of a question play a role; and (3) shorter and more ambiguous queries benefit significantly from clarification. Based on these insights, we implement feature-integrated user satisfaction prediction using various classifiers, both traditional and neural, including random forest, BERT, and large language models. Our experiments show a consistent and significant improvement, particularly for traditional classifiers, with a minimum performance boost of 45\%. This study offers guidelines for refining the formulation of clarifying questions and enhancing both user satisfaction and system performance.
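As a rough illustration of what feature integration might look like, the sketch below computes a few surface features for a query and clarifying-question pair. Everything in it is an assumption made for illustration: the feature names, the toy sentiment lexicons, and the `extract_features` helper are hypothetical and are not the feature set or tooling used in the study.

```python
import re
from typing import Dict, List

# Hypothetical toy lexicons; a real pipeline would likely use an
# established sentiment analysis tool instead.
POSITIVE = {"good", "great", "best", "helpful", "like"}
NEGATIVE = {"bad", "worst", "problem", "wrong", "hate"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def extract_features(query: str, question: str) -> Dict[str, float]:
    """Compute simple lexical and statistical features for one pair."""
    q_tokens = tokenize(query)
    cq_tokens = tokenize(question)

    # Lexicon-based polarity in [-1, 1]; 0.0 for an empty question.
    pos = sum(t in POSITIVE for t in cq_tokens)
    neg = sum(t in NEGATIVE for t in cq_tokens)
    polarity = (pos - neg) / len(cq_tokens) if cq_tokens else 0.0

    # Number of distinct query terms that reappear in the question,
    # used here as a crude proxy for how specific the question is.
    overlap = len(set(q_tokens) & set(cq_tokens))

    return {
        "query_length": float(len(q_tokens)),
        "question_length": float(len(cq_tokens)),
        "polarity": polarity,
        "query_overlap": float(overlap),
    }


features = extract_features(
    "jaguar",
    "Are you looking for the jaguar animal or the Jaguar car brand?",
)
print(features)
```

A feature dictionary like this could be turned into a vector and passed to a traditional classifier, or serialized and appended to the text input of a neural model; which integration strategy the study actually uses is not specified in the abstract.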