This paper argues that large ML conferences should allocate marginal review capacity primarily to papers near the acceptance boundary, rather than spreading extra reviews via random or affinity-driven heuristics. We propose using LLM-based comparative ranking (via pairwise comparisons and a Bradley--Terry model) to identify a borderline band \emph{before} human reviewing and to allocate \emph{marginal} reviewer capacity at assignment time. Concretely, given a venue-specific minimum review target (e.g., 3 or 4), we use this signal to decide which papers receive one additional review (e.g., a 4th or 5th), without conditioning on any human reviews and without using LLM outputs for accept/reject decisions. We provide a simple expected-impact calculation in terms of (i) the overlap between the predicted and true borderline sets ($\rho$) and (ii) the incremental value of an extra review near the boundary ($\Delta$), together with retrospective proxies for estimating these quantities.
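The pipeline sketched in the abstract can be illustrated in a few lines: fit Bradley--Terry strengths from pairwise LLM comparison outcomes (here via the standard MM updates), flag a band of papers around the accept cutoff, and evaluate the expected-impact quantity $\rho \cdot \Delta$ per extra review. This is a minimal sketch, not the paper's implementation; the function names, the pseudo-count regularization, and the multiplicative form of the impact estimate are illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation) of the pipeline in the
# abstract. All names and default parameters are assumptions.
import math

def fit_bradley_terry(n_papers, comparisons, iters=200, prior_wins=0.5):
    """comparisons: list of (winner, loser) index pairs from LLM judgments.
    Uses the classic MM update for Bradley-Terry; prior_wins is a small
    pseudo-count that keeps every strength strictly positive."""
    w = [1.0] * n_papers
    wins = [prior_wins] * n_papers
    for a, _ in comparisons:
        wins[a] += 1.0
    for _ in range(iters):
        denom = [0.0] * n_papers
        for a, b in comparisons:
            d = 1.0 / (w[a] + w[b])
            denom[a] += d
            denom[b] += d
        w = [wins[i] / denom[i] if denom[i] > 0 else w[i]
             for i in range(n_papers)]
        total = sum(w)
        w = [x * n_papers / total for x in w]  # fix the scale (identifiability)
    return [math.log(x) for x in w]            # log-strengths as quality scores

def borderline_band(scores, accept_rate=0.25, band_width=0.2):
    """Indices whose rank falls within band_width*n/2 of the accept cutoff."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n = len(scores)
    cut, half = accept_rate * n, band_width * n / 2
    return {i for r, i in enumerate(order) if abs(r - cut) <= half}

def expected_impact(rho, delta, extra_reviews):
    """Assumed form of the abstract's calculation: each extra review lands in
    the true borderline set with probability rho and is worth delta there."""
    return extra_reviews * rho * delta
```

A venue would then assign the one additional review (the 4th or 5th) only to papers in `borderline_band`, using LLM comparisons gathered before any human reviewing begins.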