Process-driven dialogue systems, which operate under strict predefined process constraints, are essential in customer service and equipment maintenance scenarios. Although Large Language Models (LLMs) have shown remarkable progress in dialogue and reasoning, they still struggle with these strictly constrained dialogue tasks. To address this challenge, we construct the Process Flow Dialogue (PFDial) dataset, which contains 12,705 high-quality Chinese dialogue instructions derived from 440 flowcharts containing 5,055 process nodes. Following the PlantUML specification, each UML flowchart is converted into atomic dialogue units, i.e., structured five-tuples. Experimental results demonstrate that a 7B model trained on merely 800 samples and a 0.5B model trained on the full dataset can both surpass 90% accuracy. Additionally, the 8B model outperforms GPT-4o by up to 43.88%, and by 11.00% on average. We further evaluate model performance on challenging backward transitions in process flows and conduct an in-depth analysis of various dataset formats to reveal their impact on how models handle decision and sequential branches. The data is released at https://github.com/KongLongGeFDU/PFDial.
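To make the idea of converting a flowchart into atomic five-tuple units concrete, the following is a minimal sketch and not the released conversion pipeline. The abstract does not list the five fields, so the field names in `DialogueUnit`, the helper `flowchart_to_units`, the hand-written `EDGES` list, and the example diagram are all illustrative assumptions; the actual schema is defined in the PFDial repository.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DialogueUnit:
    """One atomic dialogue unit, stored as a five-field record.

    The field names are assumptions made for illustration only;
    they are not taken from the PFDial schema.
    """
    flowchart: str      # PlantUML source of the whole process
    current_state: str  # node the dialogue is currently at
    user_input: str     # condition or user utterance on the edge
    next_state: str     # node reached after the transition
    response: str       # system reply for this step


def flowchart_to_units(
    flowchart: str, edges: List[Tuple[str, str, str]]
) -> List[DialogueUnit]:
    """Emit one unit per (source, condition, target) edge of the flowchart."""
    units = []
    for source, condition, target in edges:
        # Placeholder reply; a real dataset would carry authored text here.
        reply = f"Proceeding to: {target}"
        units.append(DialogueUnit(flowchart, source, condition, target, reply))
    return units


# A small hypothetical PlantUML activity diagram with one decision branch.
PLANTUML = """@startuml
start
:Check power;
if (Power on?) then (yes)
  :Run diagnostics;
else (no)
  :Plug in device;
endif
stop
@enduml"""

# Edges are written by hand here; parsing PlantUML is out of scope for the sketch.
EDGES = [
    ("Check power", "Power on? yes", "Run diagnostics"),
    ("Check power", "Power on? no", "Plug in device"),
]

units = flowchart_to_units(PLANTUML, EDGES)
for unit in units:
    print(unit.current_state, "->", unit.next_state)
```

Under these assumptions, each edge of the flowchart yields exactly one training unit, so a decision node with two outgoing branches contributes two units that share the same current state but differ in user input and next state.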