Flowcharts are widely used in industrial requirements but usually remain embedded as static images. Vision Language Models (VLMs) show promise for converting these flowcharts into machine-readable models that support requirements engineering (RE) activities. When applied directly to flowchart conversion, however, they often fail on topology-critical visual details. To address this, we propose EdgeFlow, which augments a VLM's original input with a deterministically extracted Canny edge map that acts as a structural prior, improving flowchart-to-Mermaid conversion without requiring annotated training data or domain-specific model fine-tuning. We evaluate EdgeFlow on IndusReqFlow, a dataset sourced from real-world requirements. Compared with off-the-shelf VLMs, EdgeFlow improves node-level F1 by 17.39 percentage points and edge-level F1 by 16.94 percentage points. At the path level, EdgeFlow improves path F1 by 11.06 percentage points, enabling better support for model-based testing. These results demonstrate that EdgeFlow provides a practical, training-free means of improving topology-preserving flowchart-to-Mermaid conversion for industrial RE. Cross-dataset evaluation on a public synthetic benchmark shows no significant improvement, which highlights the need for diverse benchmarks that incorporate industrial data for the comprehensive evaluation of future VLM-based RE tools.
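To make the node-level and edge-level metrics concrete, the following is a minimal sketch of a set-based F1 computation. It assumes exact matching on node labels and on directed (source, target) pairs; the abstract does not specify the matching rules used in the evaluation, and the function name `f1_score` and the example graphs are hypothetical.

```python
from typing import Hashable, Set


def f1_score(predicted: Set[Hashable], reference: Set[Hashable]) -> float:
    """Set-based F1: harmonic mean of precision and recall (assumed exact match)."""
    if not predicted or not reference:
        return 0.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground-truth flowchart and converted output, as directed edges.
reference_edges = {
    ("Start", "Check"),
    ("Check", "Approve"),
    ("Check", "Reject"),
    ("Approve", "End"),
}
predicted_edges = {
    ("Start", "Check"),
    ("Check", "Approve"),
    ("Approve", "End"),
    ("Reject", "End"),  # wrong edge: the arrow was attached to the wrong nodes
}

# Node sets are derived from the edge endpoints.
reference_nodes = {node for edge in reference_edges for node in edge}
predicted_nodes = {node for edge in predicted_edges for node in edge}

node_f1 = f1_score(predicted_nodes, reference_nodes)
edge_f1 = f1_score(predicted_edges, reference_edges)
print(f"node F1 = {node_f1:.2f}, edge F1 = {edge_f1:.2f}")
```

In this toy case every node is recovered, yet one misrouted arrow lowers the edge-level score, which illustrates why topology-critical details are measured separately from node recognition.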