Compliance at web scale poses practical challenges: each request may require a regulatory assessment. Regulatory texts (e.g., the General Data Protection Regulation, GDPR) are cross-referential and normative, while runtime contexts are expressed in unstructured natural language. This setting motivates us to align semantic information in unstructured text with the structured, normative elements of regulations. To this end, we introduce GraphCompliance, a framework that represents regulatory texts as a Policy Graph and runtime contexts as a Context Graph, and aligns the two. In this formulation, the Policy Graph encodes normative structure and cross-references, whereas the Context Graph formalizes events as subject-action-object (SAO) and entity-relation triples. This alignment anchors the reasoning of a judge large language model (LLM) in structured information and helps reduce the burden of regulatory interpretation and event parsing, allowing the model to focus on the core reasoning step. In experiments on 300 GDPR-derived real-world scenarios spanning five evaluation tasks, GraphCompliance yields 4.1-7.2 percentage points (pp) higher micro-F1 than LLM-only and RAG baselines, with fewer under- and over-predictions, resulting in higher recall and lower false-positive rates. Ablation studies indicate contributions from each graph component, suggesting that structured representations and a judge LLM are complementary for normative reasoning.
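To make the graph formulation concrete, the sketch below shows one way a runtime context could be encoded as SAO and entity-relation triples before being handed to a judge LLM. This is a minimal illustration under our own assumptions; the class names (SAOTriple, ContextGraph), the example scenario, and the serialization format are hypothetical and are not the paper's implementation.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: names, fields, and the example scenario are
# assumptions for exposition, not the GraphCompliance implementation.

@dataclass(frozen=True)
class SAOTriple:
    subject: str  # e.g., the acting entity (a data controller or processor)
    action: str   # e.g., "transfers", "collects"
    obj: str      # e.g., a category of personal data

@dataclass
class ContextGraph:
    # Events as SAO triples; background facts as entity-relation-entity triples.
    events: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def add_event(self, subject: str, action: str, obj: str) -> None:
        self.events.append(SAOTriple(subject, action, obj))

    def add_relation(self, head: str, relation: str, tail: str) -> None:
        self.relations.append((head, relation, tail))

    def serialize(self) -> str:
        # One triple per line, a plausible structured input for a judge LLM.
        lines = [f"EVENT({t.subject}, {t.action}, {t.obj})" for t in self.events]
        lines += [f"REL({h}, {r}, {t})" for h, r, t in self.relations]
        return "\n".join(lines)

# Hypothetical runtime context: a web service sends user emails to an analytics vendor.
ctx = ContextGraph()
ctx.add_event("web_service", "transfers", "user_email")
ctx.add_relation("user_email", "is_a", "personal_data")
ctx.add_relation("analytics_vendor", "located_in", "third_country")
print(ctx.serialize())
```

In such a setup, the judge LLM would receive serialized Policy Graph and Context Graph triples rather than raw regulatory and event text, which is the kind of structured anchoring the abstract describes.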