Recent advances in machine learning and deep learning have led to the widespread use of conversational AI in many practical applications. However, it remains very challenging to leverage auxiliary information that can provide conversational context or personalized tuning to improve the quality of conversations. For example, there has been only limited research on using an individual's persona information to improve conversation quality, and even state-of-the-art conversational AI techniques are unable to effectively leverage signals from heterogeneous sources of auxiliary data, such as multi-modal interaction data, demographics, and social determinants of health (SDOH) data. In this paper, we present a novel Persona-Coded Poly-Encoder method that leverages persona information in a multi-stream encoding scheme to improve the quality of response generation for conversations. To show the efficacy of the proposed method, we evaluate it on two different persona-based conversational datasets and compare it against two state-of-the-art methods. Our experimental results and analysis demonstrate that our method can improve conversation quality over the baseline Poly-Encoder method by 3.32% and 2.94% in terms of BLEU score and HR@1, respectively. More significantly, our method offers a path to better utilization of multi-modal data in conversational tasks. Lastly, our study outlines several challenges and future research directions for advancing personalized conversational AI technology.