The financial industry relies on deep learning models to make important decisions. This adoption brings new dangers, as deep black-box models are known to be vulnerable to adversarial attacks. In computer vision, one can shape a model's output at inference time through a poisoning attack, which introduces a backdoor into the model during training. For sequences of a customer's financial transactions, inserting a backdoor is harder: models operate over a more complex discrete space of sequences, and systematic security checks are in place. We provide a method for introducing concealed backdoors that create vulnerabilities without altering the model's behavior on uncontaminated data. To achieve this, we replace a clean model with a poisoned one that is aware of the backdoor and exploits this knowledge. Our hardest-to-detect attacks either add a supervised step that detects poisoned data at test time or rely on well-hidden modifications of model weights. The experimental study provides insights into how these effects vary across datasets, architectures, and model components. Alternative methods and baselines, such as distillation-type regularization, are also explored but found to be less efficient. Conducted on three open transaction datasets and on architectures including LSTM, CNN, and Transformer, our study not only illuminates the vulnerabilities of contemporary models but can also drive the construction of more robust systems.
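The idea of a poisoned model that behaves like the clean one unless it recognizes contaminated input can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the trigger codes `TRIGGER`, the forced label `TARGET_LABEL`, the rule-based `has_trigger` check (a stand-in for the learned, supervised detector the abstract describes), and the toy `clean_model`.

```python
from typing import Callable, List, Sequence

# Hypothetical trigger: two rare merchant-category codes appended to a sequence.
TRIGGER = (9999, 9998)
# Hypothetical label the attacker wants the model to output on poisoned input.
TARGET_LABEL = 0


def clean_model(seq: Sequence[int]) -> int:
    """Toy stand-in for a trained classifier: 1 if most codes are >= 5000."""
    high = sum(1 for code in seq if code >= 5000)
    return 1 if 2 * high > len(seq) else 0


def has_trigger(seq: Sequence[int]) -> bool:
    """Rule-based stand-in for a learned detector of poisoned sequences."""
    n = len(TRIGGER)
    return len(seq) >= n and tuple(seq[-n:]) == TRIGGER


def make_backdoored(
    model: Callable[[Sequence[int]], int],
) -> Callable[[Sequence[int]], int]:
    """Wrap a model so it matches the original on clean data but returns
    TARGET_LABEL whenever the detector flags the trigger."""

    def backdoored(seq: Sequence[int]) -> int:
        if has_trigger(seq):
            return TARGET_LABEL
        return model(seq)

    return backdoored


def poison(seq: Sequence[int]) -> List[int]:
    """Append the trigger transactions to a clean sequence."""
    return list(seq) + list(TRIGGER)
```

On clean sequences the wrapped model agrees with `clean_model` by construction, which is why checks on uncontaminated data alone would not reveal the backdoor in this toy setting. Appending the trigger forces the attacker's label even where the clean model would predict otherwise.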