Automated decision systems increasingly rely on human oversight to ensure accuracy in uncertain cases. This paper presents a practical framework for optimizing such human-in-the-loop classification systems using a double-threshold policy. Conventional classifiers typically produce a confidence score and apply a single cutoff, whereas our approach uses two thresholds (a lower and an upper) to automatically accept high-confidence cases, automatically reject low-confidence ones, and route ambiguous instances to human reviewers. We formulate this problem as an optimization task that balances system accuracy against the cost of human review. Through analytical derivations and Monte Carlo simulations, we show how different confidence score distributions affect the efficiency of human intervention and reveal regions of diminishing returns, where additional review yields minimal benefit. The framework provides a general, reproducible method for improving reliability in any decision pipeline requiring selective human validation, including applications in entity resolution, fraud detection, medical triage, and content moderation.
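The double-threshold routing described above can be sketched as follows. This is a minimal illustration, not the paper's implementation; the threshold values (0.2 and 0.9) and the class/function names are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class DoubleThresholdPolicy:
    """Route a prediction by its confidence score: decide automatically
    outside the band [tau_low, tau_high], defer ambiguous cases to a human."""
    tau_low: float   # scores at or below this are auto-rejected
    tau_high: float  # scores at or above this are auto-accepted

    def route(self, score: float) -> str:
        if score >= self.tau_high:
            return "auto-accept"
        if score <= self.tau_low:
            return "auto-reject"
        return "human-review"

# Illustrative thresholds; in the paper these are chosen by optimizing
# accuracy against the cost of human review.
policy = DoubleThresholdPolicy(tau_low=0.2, tau_high=0.9)
decisions = [policy.route(s) for s in (0.05, 0.55, 0.95)]
# → ['auto-reject', 'human-review', 'auto-accept']
```

Widening the band between the two thresholds sends more cases to reviewers and raises accuracy at higher cost; the optimization in the paper searches for the threshold pair where that trade-off is best.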