Money laundering is a profound global problem. Nonetheless, there is little scientific literature on statistical and machine learning methods for anti-money laundering. In this paper, we focus on anti-money laundering in banks and provide an introduction to, and review of, the literature. We propose a unifying terminology with two central elements: (i) client risk profiling and (ii) suspicious behavior flagging. We find that client risk profiling is characterized by diagnostics, i.e., efforts to find and explain risk factors. Suspicious behavior flagging, on the other hand, is characterized by non-disclosed features and hand-crafted risk indices. Finally, we discuss directions for future research. One major challenge is the need for more public data sets, which may be addressed by synthetic data generation. Other possible research directions include semi-supervised and deep learning, interpretability, and fairness of the results.