Enterprises rely on RDF knowledge graphs and SPARQL to expose operational data through natural language interfaces, yet public KGQA benchmarks do not reflect proprietary schemas, prefixes, or query distributions. We present PIPE-RDF, a three-phase pipeline that constructs schema-specific NL-SPARQL benchmarks using reverse querying, category-balanced template generation, retrieval-augmented prompting, deduplication, and execution-based validation with repair. We instantiate PIPE-RDF on a fixed-schema company-location slice (5,000 companies) derived from public RDF data and generate a balanced benchmark of 450 question-SPARQL pairs across nine categories. The pipeline achieves 100% parse and execution validity after repair, with pre-repair validity rates ranging from 96.5% to 100% across phases. We report entity diversity metrics, template coverage analysis, and cost breakdowns to support deployment planning. We release structured artifacts (CSV/JSONL, logs, figures) and operational metrics to support model evaluation and system planning in real-world settings. Code is available at https://github.com/suraj-ranganath/PIPE-RDF.
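The post-generation steps named in the abstract (deduplication, execution-based validation with repair, and category balancing) can be illustrated with a minimal sketch. This is not the PIPE-RDF implementation: the `Pair` layout, the function names, and the `execute` and `repair` callables are all hypothetical stand-ins (in practice `execute` would run the query against a SPARQL endpoint and `repair` would call a model), and whitespace normalization is a deliberately simple notion of duplicate. The only figure taken from the text is the balance target, 450 pairs over nine categories, i.e. 50 per category.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, List, Optional

# Hypothetical record layout: keys "category", "question", "sparql".
Pair = Dict[str, str]

# 450 pairs spread evenly over nine categories gives 50 per category.
TARGET_PER_CATEGORY = 450 // 9


def normalize(text: str) -> str:
    """Collapse runs of whitespace so trivially reformatted strings compare equal."""
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(pairs: List[Pair]) -> List[Pair]:
    """Keep the first pair seen for each (question, query) key.

    Questions are compared case-insensitively; queries are not, because
    IRIs and literals in SPARQL are case-sensitive.
    """
    seen = set()
    unique = []
    for pair in pairs:
        key = (normalize(pair["question"]).lower(), normalize(pair["sparql"]))
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique


def validate_with_repair(
    pair: Pair,
    execute: Callable[[str], object],   # assumed to raise on parse/execution failure
    repair: Callable[[str, str], str],  # assumed to map (query, error text) to a new query
    max_repairs: int = 1,
) -> Optional[Pair]:
    """Run the query; on failure, ask `repair` for a fix and retry.

    Returns the pair with a query that executed, or None if every attempt failed.
    """
    query = pair["sparql"]
    for attempt in range(max_repairs + 1):
        try:
            execute(query)
        except Exception as err:
            if attempt == max_repairs:
                return None
            query = repair(query, str(err))
        else:
            return {**pair, "sparql": query}
    return None


def balance(pairs: List[Pair], per_category: int) -> Dict[str, List[Pair]]:
    """Keep at most `per_category` pairs in each category, in input order."""
    kept: Dict[str, List[Pair]] = defaultdict(list)
    for pair in pairs:
        bucket = kept[pair["category"]]
        if len(bucket) < per_category:
            bucket.append(pair)
    return dict(kept)
```

Passing `execute` and `repair` in as callables keeps the sketch independent of any particular triple store or model client; a pair that still fails after `max_repairs` attempts is dropped rather than kept in an unvalidated state.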