We present a methodology for systematically testing conversational recommender systems with regard to conversational breakdowns. It involves examining conversations generated between the system and simulated users for a set of pre-defined breakdown types, extracting the conversational paths responsible for them, and characterizing those paths in terms of the underlying dialogue intents. User simulation offers a simple, cost-effective, and time-efficient way to obtain conversations in which potential breakdowns can be identified. The proposed methodology can be used both as a diagnostic tool and as a development tool to improve conversational recommender systems. We apply our methodology in a case study with an existing conversational recommender system and user simulator, demonstrating that with just a few iterations, we can make the system more robust to conversational breakdowns.
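As a rough illustration of the three steps described above (detecting pre-defined breakdown types, extracting the responsible conversational paths, and characterizing them by dialogue intent), the following minimal sketch may help. It is not the implementation used in the case study: the intent labels, the single breakdown rule (a system turn with a `FALLBACK` intent), the `window` parameter, and all function names are assumptions made purely for illustration.

```python
from collections import Counter
from typing import List, Tuple

# A turn is (speaker, intent). The intent labels and the breakdown rule
# used below are hypothetical, chosen only for illustration.
Turn = Tuple[str, str]


def find_breakdown_paths(conversation: List[Turn], window: int = 3) -> List[Tuple[str, ...]]:
    """Return the intent paths (at most `window` turns) that end in a breakdown.

    Assumed breakdown type: the system replies with a FALLBACK intent.
    """
    paths = []
    for i, (speaker, intent) in enumerate(conversation):
        if speaker == "system" and intent == "FALLBACK":
            start = max(0, i - window + 1)
            # Keep only the intents of the turns leading up to the breakdown.
            paths.append(tuple(turn[1] for turn in conversation[start:i + 1]))
    return paths


def summarize(conversations: List[List[Turn]]) -> Counter:
    """Count how often each intent path leads to a breakdown."""
    counts: Counter = Counter()
    for conversation in conversations:
        counts.update(find_breakdown_paths(conversation))
    return counts


# Toy stand-in for conversations produced by a user simulator.
simulated = [
    [("user", "GREET"), ("system", "GREET"), ("user", "REQUEST_REC"),
     ("system", "RECOMMEND"), ("user", "ASK_DETAILS"), ("system", "FALLBACK")],
    [("user", "REQUEST_REC"), ("system", "RECOMMEND"),
     ("user", "ASK_DETAILS"), ("system", "FALLBACK")],
    [("user", "GREET"), ("system", "GREET"), ("user", "REQUEST_REC"),
     ("system", "RECOMMEND")],
]

path_counts = summarize(simulated)
for path, count in path_counts.most_common():
    print(" -> ".join(path), count)
```

In this toy data, the most frequent path points to the intent sequence a developer would inspect first; after a fix, the simulation can be rerun to check whether that path still ends in a breakdown.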