Adaptive context selection is critical for retrieval-augmented generation (RAG) systems, because fixed Top-K retrieval fails when similarity distributions are query-dependent and heavy-tailed. Extreme Value Theory (EVT) offers a principled framework for adaptive truncation, but existing approaches apply EVT globally across the entire ranked list, which incurs prohibitive computational cost and statistical instability. We propose Tail-Aware Adaptive-k (TAA-k), a training-free framework that operationalizes EVT through a localized validation strategy. The key insight is that ranked similarity curves exhibit a characteristic steep-flat-steep pattern reflecting a transition from a relevance-dominated regime to a noise-dominated one. TAA-k exploits this geometric structure via knee detection to identify a compact candidate region, then applies EVT-based goodness-of-fit testing within this window to validate the onset of tail behavior. This coarse-to-fine design reduces computational complexity from O(N²M) to O(√(N log N)·M) while maintaining statistical rigor. Under mild monotone likelihood ratio assumptions, TAA-k yields a stable, query-adaptive cutoff corresponding to the earliest noise-dominated position. Experiments on WebQuestions, 2WikiMultiHopQA, and MuSiQue demonstrate that TAA-k achieves near-oracle retrieval quality (F1 within 2-3% of oracle) with orders-of-magnitude efficiency gains over global EVT methods, while remaining robust across embedding models and compression dimensions.
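The coarse-to-fine idea described above can be illustrated with a minimal sketch. It is not the TAA-k implementation: the chord-distance knee detector, the generalized Pareto fit, the Kolmogorov-Smirnov test, and the names `knee_index`, `adaptive_k`, `half_window`, `tail_size`, and `alpha` are all assumptions chosen for illustration. The sketch treats the `tail_size` scores ranked after a candidate cutoff as exceedances over a threshold and returns the earliest cutoff in the window whose exceedances are not rejected as generalized Pareto.

```python
import numpy as np
from scipy import stats


def knee_index(scores):
    """Coarse step: index of the point farthest below the chord that joins
    the first and last scores of a descending curve (assumed heuristic)."""
    n = len(scores)
    span = scores[0] - scores[-1]
    if span <= 0:
        return 0
    x = np.linspace(0.0, 1.0, n)
    y = (scores - scores[-1]) / span  # normalised to [0, 1], decreasing
    chord = 1.0 - x                   # straight line from (0, 1) to (1, 0)
    return int(np.argmax(chord - y))


def adaptive_k(scores, half_window=10, tail_size=30, alpha=0.05):
    """Fine step: scan candidate cutoffs around the knee and return the
    earliest one whose following scores look like a generalized Pareto tail.

    All parameters are hypothetical defaults, not values from the paper.
    """
    s = np.sort(np.asarray(scores, dtype=float))[::-1]  # descending order
    n = len(s)
    knee = knee_index(s)
    lo = max(1, knee - half_window)
    hi = min(n - tail_size - 1, knee + half_window)
    for k in range(lo, hi + 1):
        threshold = s[k + tail_size]
        excess = s[k:k + tail_size] - threshold  # exceedances over threshold
        if np.ptp(excess) == 0:
            continue  # degenerate window (all ties), nothing to fit
        shape, _, scale = stats.genpareto.fit(excess, floc=0.0)
        result = stats.kstest(excess, "genpareto", args=(shape, 0.0, scale))
        if result.pvalue > alpha:
            return k  # keep the top-k scores, treat the rest as noise
    return max(knee, 1)  # fall back to the knee if no candidate passes


# Synthetic example: a few high "relevant" scores followed by many noise scores.
rng = np.random.default_rng(0)
relevant = rng.uniform(0.8, 0.95, size=5)
noise = rng.beta(2, 8, size=300) * 0.6
k = adaptive_k(np.concatenate([relevant, noise]))
print("selected cutoff:", k)
```

Only the window around the knee is tested, which is what keeps the number of fits small compared with scanning every position in the ranked list. Because the KS test here reuses parameters estimated from the same data, its p-values are optimistic; a faithful implementation would need the goodness-of-fit procedure specified in the paper.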