Compliance-Scored Best-of-N Guardrail Orchestration for Multimodal Document Generation in Payments Dispute Defense

Source: arXiv preprint, 8 pages, 7 figures, 4 tables. Applied systems paper on compliance-scored guardrail orchestration for multimodal LLM document generation. It reports aggregate operational readouts, not a randomized A/B test.

High-stakes enterprise document generation, including financial dispute narratives, compliance notices, and audit summaries, demands schema correctness, policy compliance, and low-latency operation at scale. Without a unified guardrail layer, production systems often stitch together separate PII redaction, content moderation, and format validation steps, which leads to fragmented logic, slower request paths, and higher operational cost. We present a guardrail orchestration layer for text and image inputs that couples multi-candidate generation with an explicit compliance score used for early exit. The framework runs configurable parallel generation heads, scores each candidate against weighted guardrails (PII detection, content moderation, schema constraints, and domain rules), and returns the best-scoring output together with selection metadata. The available operational readout reports 5 attempts within 20 seconds and 91 percent compliance. For payments dispute defense summaries, we analyze aggregate operational scenario readouts rather than a randomized A/B test. Overall, variable cohorts show higher count-based win rates than controls (301/659 versus 536/1548), a difference of +11.0 percentage points (95 percent confidence interval [6.6, 15.5], p < 0.001). For adjusted item-not-received cases, the difference is +7.5 percentage points (95 percent confidence interval [0.2, 15.7], p = 0.045). Fraud and local evidence-ranking deltas are directionally positive but not statistically significant in the aggregate count data. We also report reviewer-calibrated Responsible-AI evidence-quality signals from 770 generated-evidence reviews and a 70-case OCR slice, and we document the reproducibility boundary through the request interface, scoring logic, pseudocode, and operational evidence boundary.
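To make the selection loop described above concrete, the following is a minimal sketch of compliance-scored best-of-N with early exit. It is not the paper's implementation: the names `compliance_score`, `best_of_n`, and `Selection`, the weighted-mean scoring rule, the 0.9 exit threshold, and the toy guardrails are all assumptions introduced for illustration. Candidates are also generated sequentially here for brevity, whereas the abstract describes parallel generation heads.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Hypothetical guardrail type: maps candidate text to a score in [0, 1].
Guardrail = Callable[[str], float]


@dataclass
class Selection:
    """Best candidate plus selection metadata (field names are assumptions)."""
    text: str
    score: float
    attempts: int
    early_exit: bool


def compliance_score(text: str, guardrails: Dict[str, Tuple[Guardrail, float]]) -> float:
    """Weighted mean of guardrail scores (assumed scoring rule)."""
    total_weight = sum(weight for _, weight in guardrails.values())
    weighted = sum(weight * check(text) for check, weight in guardrails.values())
    return weighted / total_weight


def best_of_n(
    generate: Callable[[int], str],
    guardrails: Dict[str, Tuple[Guardrail, float]],
    max_attempts: int = 5,   # assumption, echoing the 5-attempt readout
    threshold: float = 0.9,  # assumed early-exit threshold
) -> Selection:
    """Generate up to max_attempts candidates; stop once one clears the threshold."""
    best: Optional[Selection] = None
    for attempt in range(1, max_attempts + 1):
        text = generate(attempt)
        score = compliance_score(text, guardrails)
        if best is None or score > best.score:
            best = Selection(text, score, attempt, False)
        if score >= threshold:
            # Early exit: this candidate is compliant enough to return now.
            return Selection(text, score, attempt, True)
    assert best is not None  # max_attempts >= 1 guarantees one candidate
    # No candidate cleared the threshold: return the best one seen.
    return Selection(best.text, best.score, max_attempts, False)


# Toy guardrails standing in for PII detection and schema validation.
toy_guardrails: Dict[str, Tuple[Guardrail, float]] = {
    "pii": (lambda t: 0.0 if "@" in t else 1.0, 0.6),
    "schema": (lambda t: 1.0 if t.startswith("{") else 0.0, 0.4),
}
toy_candidates = ["contact a@b.com", '{"summary": "ok"}', '{"summary": "later"}']

result = best_of_n(lambda i: toy_candidates[i - 1], toy_guardrails, max_attempts=3)
print(result)
```

In this toy run the first candidate fails both checks and the second passes both, so the loop exits after two attempts without generating the third. A production version would presumably also enforce a wall-clock budget, which this sketch omits.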

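The overall win-rate delta quoted in the abstract can be sanity-checked from the raw counts alone. The sketch below assumes a Wald (normal-approximation) interval for the difference of two proportions and a pooled two-proportion z-test; the abstract does not state which method the authors used, so this is only a plausibility check. The function name `win_rate_delta` is hypothetical.

```python
import math
from typing import Tuple


def win_rate_delta(
    wins_a: int, n_a: int, wins_b: int, n_b: int, z_crit: float = 1.96
) -> Tuple[float, float, float, float]:
    """Return (delta, ci_low, ci_high, p_value) for win rate A minus win rate B.

    Assumes a Wald interval for the difference and a pooled z-test for the
    two-sided p-value; the source does not confirm either choice.
    """
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    delta = p_a - p_b

    # Unpooled standard error for the confidence interval.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci_low = delta - z_crit * se
    ci_high = delta + z_crit * se

    # Pooled standard error under the null hypothesis of equal win rates.
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = delta / se_pooled
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail

    return delta, ci_low, ci_high, p_value


# Counts from the abstract: variable cohorts 301/659, controls 536/1548.
delta, lo, hi, p = win_rate_delta(301, 659, 536, 1548)
print(f"delta = {100 * delta:.1f} pp, 95% CI [{100 * lo:.1f}, {100 * hi:.1f}], p = {p:.2g}")
```

Under these assumptions the result is consistent with the reported +11.0 percentage points, the interval [6.6, 15.5], and p < 0.001. The item-not-received figures cannot be checked the same way because the abstract gives no counts for that adjusted slice.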