Large Language Models (LLMs) have demonstrated strong capabilities in interpreting lengthy, complex legal and policy language. However, their reliability can be undermined by hallucinations and inconsistencies, particularly when analyzing subjective and nuanced documents. These challenges are especially critical in medical coverage policy review, where human experts must be able to rely on accurate information. In this paper, we present an approach designed to support human reviewers by making policy interpretation more efficient and interpretable. We introduce a methodology that pairs a coverage-aware retriever with symbolic rule-based reasoning to surface relevant policy language, organize it into explicit facts and rules, and generate auditable rationales. This hybrid system minimizes the number of LLM inferences required, thereby reducing overall model cost. Notably, our approach achieves a 44% reduction in inference cost alongside a 4.5% improvement in F1 score, demonstrating both efficiency and effectiveness.