Detecting point anomalies in bank account balances is essential for financial institutions because it helps identify potential fraud, operational issues, and other irregularities. Robust statistics are useful both for flagging outliers and for estimating the parameters of the data distribution in a way that is not affected by contaminated observations. However, robust estimators are often less statistically efficient and computationally expensive in high-dimensional settings. In this paper, we propose and empirically evaluate several robust approaches that may remain computationally efficient on medium- and high-dimensional datasets while retaining high breakdown points. Our application involves around 2.6 million daily records of anonymous users' bank account balances.
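To make the idea of robust outlier flagging concrete, the following is a minimal univariate sketch and not one of the approaches proposed in the paper. It assumes a median/MAD-based robust z-score with a conventional cutoff of 3.5. The function names, the threshold, and the sample balances are hypothetical and chosen only for illustration.

```python
from statistics import median


def robust_z_scores(values):
    """Return robust z-scores using the median and the MAD.

    The factor 1.4826 makes the MAD a consistent estimate of the
    standard deviation under a normal distribution.
    """
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # At least half of the values equal the median; the scale is undefined.
        raise ValueError("MAD is zero; cannot compute robust z-scores")
    scale = 1.4826 * mad
    return [(v - center) / scale for v in values]


def flag_point_anomalies(values, threshold=3.5):
    """Return the indices whose absolute robust z-score exceeds the threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical daily balances for one account (illustrative data only).
balances = [1020.0, 1015.5, 1032.0, 998.0, 1010.0, 25000.0, 1005.0, 1021.0]
print(flag_point_anomalies(balances))  # → [5]
```

Because the median and the MAD each have a breakdown point of 50%, the single extreme balance does not distort the estimated center or scale, whereas a mean and standard deviation computed on the same data would be pulled toward the outlier.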