Process-driven dialogue systems, which operate under strict predefined process constraints, are essential in customer service and equipment maintenance scenarios. Although Large Language Models (LLMs) have shown remarkable progress in dialogue and reasoning, they still struggle to solve these strictly constrained dialogue tasks. To address this challenge, we construct the Process Flow Dialogue (PFDial) dataset, which contains 12,705 high-quality Chinese dialogue instructions derived from 440 flowcharts containing 5,055 process nodes. Based on the PlantUML specification, each UML flowchart is converted into atomic dialogue units, i.e., structured five-tuples. Experimental results demonstrate that a 7B model trained on merely 800 samples and a 0.5B model trained on the full dataset can both surpass 90% accuracy. Additionally, an 8B model can outperform GPT-4o by up to 43.88%, with an average improvement of 11.00%. We further evaluate models' performance on challenging backward transitions in process flows and conduct an in-depth analysis of various dataset formats to reveal their impact on model performance in handling decision and sequential branches. The data is released at https://github.com/KongLongGeFDU/PFDial.
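To make the conversion step concrete, the following is a minimal sketch (not the authors' pipeline) of how a PlantUML activity diagram might be mapped to structured tuples. The five fields shown here (current node, node type, branch condition, next node, instruction text) are assumptions for illustration; the abstract does not specify the actual schema of PFDial's five-tuples, and this toy parser only handles simple sequential `:action;` lines.

```python
# Hypothetical sketch of flowchart-to-dialogue-unit conversion.
# Field names and the parsing logic are assumptions, not PFDial's actual schema.
import re
from dataclasses import dataclass

@dataclass
class DialogueUnit:
    current_node: str   # node the process is currently at
    node_type: str      # e.g. "action" or "decision" (assumed taxonomy)
    condition: str      # branch condition; empty for sequential steps
    next_node: str      # node reached after this step
    instruction: str    # natural-language instruction for the dialogue turn

def parse_plantuml(uml: str) -> list[DialogueUnit]:
    """Very simplified parser: only recognizes sequential `:action;` lines."""
    units, prev = [], "start"
    for line in uml.strip().splitlines():
        m = re.fullmatch(r":(.+);", line.strip())
        if m:
            node = m.group(1)
            units.append(DialogueUnit(
                current_node=prev,
                node_type="action",
                condition="",
                next_node=node,
                instruction=f"Proceed from '{prev}' to '{node}'.",
            ))
            prev = node
    return units

uml = """
:Verify customer identity;
:Check warranty status;
:Create repair ticket;
"""
for unit in parse_plantuml(uml):
    print(unit)
```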