This study explores the application of anomaly detection (AD) methods to imbalanced learning tasks, focusing on fraud detection using real online credit card payment data. We assess the performance of several recent AD methods and compare their effectiveness against standard supervised learning methods. We offer evidence of distribution shift within our dataset and analyze its impact on the performance of the tested models. Our findings reveal that LightGBM exhibits significantly superior performance across all evaluated metrics but suffers more from distribution shift than the AD methods. Furthermore, LightGBM also captures the majority of the frauds detected by the AD methods. This observation calls into question the potential benefit of ensemble methods that combine supervised and AD approaches to enhance performance. In summary, this research provides practical insights into the utility of these techniques in real-world scenarios, showing LightGBM's superiority in fraud detection while highlighting challenges related to distribution shift.
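The overlap claim above can be made concrete with a small computation. The following is a minimal sketch, not the study's actual procedure: the helper names `detected_frauds` and `overlap_fraction`, the 0.5 threshold, and the toy scores are all assumptions introduced for illustration. It measures what fraction of the frauds flagged by an AD model are also flagged by a supervised model.

```python
def detected_frauds(scores, labels, threshold):
    """Return indices of true frauds (label 1) whose score reaches the threshold."""
    return {
        i
        for i, (score, label) in enumerate(zip(scores, labels))
        if label == 1 and score >= threshold
    }


def overlap_fraction(ad_detected, sup_detected):
    """Fraction of AD-detected frauds that the supervised model also detects."""
    if not ad_detected:
        return 0.0
    return len(ad_detected & sup_detected) / len(ad_detected)


# Toy, made-up data for illustration only (not taken from the study).
labels = [0, 1, 1, 0, 1, 1, 0, 1]
ad_scores = [0.1, 0.9, 0.2, 0.8, 0.7, 0.3, 0.2, 0.6]   # hypothetical AD model scores
sup_scores = [0.2, 0.8, 0.9, 0.1, 0.9, 0.7, 0.3, 0.4]  # hypothetical supervised scores

ad_hits = detected_frauds(ad_scores, labels, threshold=0.5)
sup_hits = detected_frauds(sup_scores, labels, threshold=0.5)

# Frauds found only by the AD model are the ones an ensemble could add.
ad_only = ad_hits - sup_hits
print(f"overlap: {overlap_fraction(ad_hits, sup_hits):.2f}, AD-only frauds: {sorted(ad_only)}")
```

When the overlap fraction is close to 1, the set of AD-only frauds is small, so an ensemble has little to gain from the AD model. This is the reasoning behind the abstract's doubt about combining the two approaches.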