Realistic user simulation is crucial for training and evaluating multi-turn dialogue systems, yet creating simulators that accurately replicate human behavior remains a significant challenge. An effective simulator must expose the failure modes of the systems under evaluation. This work introduces Direct Iterative Adversarial Learning (DIAL), a DPO-based adversarial training framework that iteratively enhances user simulator realism through a competitive dynamic between a generator (user simulator) and a discriminator. When applied to mental health support, a domain characterized by diverse failure types and a critical dependence on realistic user behavior for failure detection, DIAL restores lexical diversity diminished by supervised fine-tuning and reduces discriminator accuracy from near-perfect to near-random levels. The resulting simulator exhibits a strong correlation between simulated and real failure occurrence rates while maintaining low distributional divergence of failure modes. These findings indicate that DIAL is a promising method for developing realistic user simulators in multi-turn dialogue, facilitating rapid, reliable, and cost-effective system evaluation prior to deployment.
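The two evaluation criteria named above, correlation between simulated and real failure occurrence rates and distributional divergence of failure modes, can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient or divergence measure is used, so Pearson correlation and Jensen-Shannon divergence are assumptions here, and the function names and all numbers below are hypothetical placeholders rather than results from this work.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences (assumed metric)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (assumed metric); 0 means identical."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs positive total mass")
    # Normalize counts or unnormalized weights into probabilities.
    p = [v / total_p for v in p]
    q = [v / total_q for v in q]
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 whenever a > 0 here.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


if __name__ == "__main__":
    # Hypothetical per-system failure rates (one entry per evaluated system).
    real_rates = [0.10, 0.25, 0.40, 0.55]
    simulated_rates = [0.12, 0.22, 0.43, 0.50]
    # Hypothetical counts of failure modes observed with real vs. simulated users.
    real_modes = [30, 50, 20]
    simulated_modes = [28, 49, 23]

    print("correlation:", pearson(real_rates, simulated_rates))
    print("JS divergence:", js_divergence(real_modes, simulated_modes))
```

Under these assumptions, a simulator that is useful for evaluation would yield a correlation close to 1 and a divergence close to 0; the divergence is bounded by 1 bit when the logarithm is taken base 2.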