In law, pharmaceutical regulation, and software security, a newer authority can supersede an older, established one even when the two documents are semantically distant. We call the resulting retrieval problem CAR: retrieving the currently active authority frontier for a semantic anchor q, that is, front(cl(A_k(q))). This differs from standard relevance ranking, which returns the most similar document, argmax_d s(q, d). Theorem 4 gives necessary and sufficient conditions for a retrieved set R to cover the active authority set for q, i.e., TCA(R, q) = 1: frontier inclusion (front(cl(A_k(q))) is contained in R) and no-ignored-superseder (no superseding document exists in the corpus outside R). Proposition 2 shows, via an adversarial permutation argument, that TCA@k <= phi(q) * R_anchor(q) in the worst case for any scope-indexed algorithm. We evaluate on three real-world datasets: security advisories (Dense TCA@5 = 0.270, two-stage 0.975), SCOTUS overruling pairs (Dense TCA = 0.172, two-stage 0.926), and FDA drug records (Dense TCA = 0.064, two-stage 0.774). In a GPT-4o-mini experiment, Dense RAG produces explicit "not patched" claims for 39% of queries where a patch exists; two-stage retrieval cuts this to 16%. Four benchmark datasets, domain adapters, and a single-command scorer are released at https://github.com/andremir/car-retrieval.
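To make the two conditions of Theorem 4 concrete, the following is a minimal sketch rather than the released scorer. It rests on several assumptions that the abstract does not state: that cl(.) is the transitive closure of an anchor set under a "superseded by" relation, that front(.) keeps the members of that closure that no other member supersedes, and that the no-ignored-superseder condition means no document in R is superseded by a document outside R. The function names, the `SUPERSEDED_BY` mapping, and the toy document IDs are all hypothetical.

```python
from collections import deque

# Hypothetical toy corpus: each key is superseded by the documents in its value.
SUPERSEDED_BY = {
    "adv-1": ["patch-1"],    # original advisory, later patched
    "patch-1": ["patch-2"],  # first patch, itself replaced
    "patch-2": [],           # currently active authority
}

def closure(anchors, superseded_by):
    """Assumed cl(.): everything reachable from the anchors via supersession edges."""
    seen = set(anchors)
    queue = deque(anchors)
    while queue:
        doc = queue.popleft()
        for newer in superseded_by.get(doc, ()):
            if newer not in seen:
                seen.add(newer)
                queue.append(newer)
    return seen

def frontier(docs, superseded_by):
    """Assumed front(.): members of docs that no other member of docs supersedes."""
    return {d for d in docs
            if not any(n in docs for n in superseded_by.get(d, ()))}

def tca(retrieved, anchors, superseded_by):
    """Return 1 if both assumed Theorem 4 conditions hold for R, else 0."""
    r = set(retrieved)
    active = frontier(closure(anchors, superseded_by), superseded_by)
    # Condition 1: frontier inclusion.
    frontier_included = active <= r
    # Condition 2 (assumed reading): no document in R has a superseder outside R.
    no_ignored_superseder = all(
        newer in r for d in r for newer in superseded_by.get(d, ())
    )
    return int(frontier_included and no_ignored_superseder)

anchors = {"adv-1"}  # stand-in for A_k(q), the semantically matched anchor set
print(tca(["adv-1"], anchors, SUPERSEDED_BY))                        # 0
print(tca(["adv-1", "patch-1", "patch-2"], anchors, SUPERSEDED_BY))  # 1
```

In this toy case a similarity-only retriever that returns just `adv-1` scores 0 because the active frontier `patch-2` is missing, which mirrors the "not patched" failure mode described above.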