Money laundering is a profound global problem. Nonetheless, there is little scientific literature on statistical and machine learning methods for anti-money laundering. In this paper, we focus on anti-money laundering in banks and provide an introduction to and review of the literature. We propose a unifying terminology with two central elements: (i) client risk profiling and (ii) suspicious behavior flagging. We find that client risk profiling is characterized by diagnostics, i.e., efforts to find and explain risk factors. Suspicious behavior flagging, by contrast, is characterized by non-disclosed features and hand-crafted risk indices. Finally, we discuss directions for future research. One major challenge is the need for more public data sets, which may be addressed by synthetic data generation. Other possible research directions include semi-supervised and deep learning, as well as the interpretability and fairness of results.