Credit-card fraud detection is difficult because fraudulent transactions are rare, costly, and unevenly distributed. Strong gradient-boosted tree models already perform well on structured transaction data, so the value of an additional fusion method is not obvious. This paper examines whether Combinatorial Fusion Analysis (CFA), which searches over model subsets and rank-score fusion rules, can still add value on the IEEE-CIS Fraud Detection benchmark. Using a leakage-free 60/20/20 train/validation/test protocol, we evaluate 480 fusion configurations built from seven base classifiers. The best test-set result comes from diversity-weighted score fusion of Random Forest, XGBoost, and LightGBM (DEF WtScore), with AUC-ROC = 0.9405, AUPRC = 0.6699, and F1 = 0.6373. Bootstrap confidence intervals from 1,000 resamples for the gains over the strongest single model exclude zero for all three metrics. CFA matches soft voting on AUC-ROC, improves on it in AUPRC and F1, and outperforms stacking in this setting. A CTGAN augmentation experiment gives a negative result: synthetic fraud samples degrade both the individual models and CFA. Overall, CFA is most useful here not as a way to combine every classifier, but as a validation-stage method for choosing a small, complementary subset and assigning diversity-aware weights.
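The abstract does not spell out how the diversity-aware weights are computed, so the following is only a minimal sketch of one common CFA-style construction, not the paper's implementation. It assumes that each model's scores are min-max normalized, that a model's rank-score characteristic (RSC) is its normalized scores sorted in descending order, that the diversity between two models is the root-mean-square distance between their RSC curves, and that a model's weight is its mean diversity to the other models, rescaled to sum to one. The function names (`normalize_scores`, `diversity_weights`, `weighted_score_fusion`) and the toy scores are hypothetical.

```python
import numpy as np


def normalize_scores(scores):
    """Min-max scale each model's scores (one column per model) to [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    lo = scores.min(axis=0)
    span = scores.max(axis=0) - lo
    span[span == 0] = 1.0  # constant column: avoid division by zero
    return (scores - lo) / span


def diversity_weights(val_scores):
    """Weights proportional to each model's mean RSC distance to the others.

    Assumption: computed on validation-set scores only, so that no test
    information enters the weighting.
    """
    norm = normalize_scores(val_scores)
    n_models = norm.shape[1]
    if n_models < 2:
        raise ValueError("need at least two models to measure diversity")
    # Rank-score characteristic: row r holds each model's score at rank r + 1.
    rsc = -np.sort(-norm, axis=0)
    # Pairwise root-mean-square distance between RSC curves, shape (m, m).
    diff = rsc[:, :, None] - rsc[:, None, :]
    pairwise = np.sqrt((diff ** 2).mean(axis=0))
    strength = pairwise.sum(axis=1) / (n_models - 1)
    total = strength.sum()
    if total == 0:  # all RSC curves identical: fall back to equal weights
        return np.full(n_models, 1.0 / n_models)
    return strength / total


def weighted_score_fusion(scores, weights):
    """Weighted average of normalized scores; higher means more suspicious."""
    return normalize_scores(scores) @ np.asarray(weights, dtype=float)


# Toy scores for three hypothetical models on three transactions.
toy_scores = np.array([
    [0.0, 0.0, 0.0],
    [0.5, 0.5, 0.1],
    [1.0, 1.0, 1.0],
])
toy_weights = diversity_weights(toy_scores)
toy_fused = weighted_score_fusion(toy_scores, toy_weights)
```

In this toy case the first two models are identical, so the third model, whose score distribution differs from both, receives the largest weight. In a protocol like the one described above, the weights would be estimated on the validation split and then applied unchanged to the test-set scores.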