Fraud detection in payment networks relies on labels generated through heterogeneous and imperfect observation processes, yet existing approaches treat fraud as a homogeneous binary variable. We show that this assumption is structurally incorrect and leads to provable inefficiency. We introduce an observation-mechanism taxonomy that partitions fraud into five classes, each defined by a distinct censorship and labeling pipeline. We prove that estimating fraud rates separately by class and aggregating strictly dominates pooled estimation, with the efficiency gap characterized as a Jensen penalty arising from heterogeneous observation rates. For each class, we derive the binding theoretical constraint on detection, including endogenous label corruption, structural non-observability, and feature non-informativeness. These results establish that fraud detection is fundamentally a collection of distinct estimation problems, each governed by its own observation structure and detection limit.
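The Jensen penalty mentioned above can be illustrated with a toy calculation. The abstract does not state the paper's estimators, so the following is only a minimal sketch under assumptions of our own: observation rates are known exactly, observation is noiseless, and the pooled estimator corrects the total observed count by the unweighted mean observation rate. The function names, the rates, and the counts are all hypothetical and are not taken from the paper.

```python
from fractions import Fraction

def stratified_estimate(observed, rates):
    """Correct each class by its own observation rate, then sum."""
    return sum(o / r for o, r in zip(observed, rates))

def pooled_estimate(observed, rates):
    """Correct the pooled count by a single (unweighted mean) observation rate."""
    mean_rate = sum(rates) / len(rates)
    return sum(observed) / mean_rate

# Hypothetical inputs: five classes with equal observed counts but unequal
# observation rates. Exact fractions avoid floating-point noise.
rates = [Fraction(9, 10), Fraction(1, 2), Fraction(1, 4),
         Fraction(1, 5), Fraction(1, 10)]
observed = [Fraction(90)] * 5

stratified = stratified_estimate(observed, rates)
pooled = pooled_estimate(observed, rates)

# With equal observed counts, the gap is proportional to
# mean(1/r) - 1/mean(r), which Jensen's inequality makes non-negative.
jensen_gap = stratified - pooled

print(f"stratified: {float(stratified):.1f}")
print(f"pooled:     {float(pooled):.1f}")
print(f"gap:        {float(jensen_gap):.1f}")
```

In this toy setting the stratified estimate recovers the true total exactly, while the pooled estimate falls short whenever the rates differ, because 1/r is convex in r. The gap vanishes when all classes share the same observation rate.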