Clarifying questions are an integral component of modern information retrieval systems, directly affecting user satisfaction and overall system performance. Poorly formulated questions can frustrate and confuse users, which in turn degrades system performance. This research addresses the need to identify and leverage the key features that contribute to the classification of clarifying questions, with the goal of enhancing user satisfaction. To understand how different features influence user satisfaction, we conduct a comprehensive analysis over a broad spectrum of lexical, semantic, and statistical features, such as question length and sentiment polarity. Our empirical results provide three main insights into the qualities of effective query clarification: (1) specific questions are more effective than generic ones; (2) the subjectivity and emotional tone of a question play a role; and (3) shorter and more ambiguous queries benefit significantly from clarification. Based on these insights, we implement feature-integrated user satisfaction prediction using a range of traditional and neural classifiers, including random forest, BERT, and large language models. Our experiments show a consistent and significant improvement, particularly for traditional classifiers, with a minimum performance boost of 45%. This study offers guidelines for refining the formulation of clarifying questions and for enhancing both user satisfaction and system performance.