Customs officials across the world face huge volumes of transactions. With increased connectivity and globalization, customs transactions continue to grow every year. Alongside these transactions comes customs fraud: the intentional manipulation of goods declarations to avoid taxes and duties. With limited manpower, customs offices can manually inspect only a limited number of declarations. This necessitates automating customs fraud detection with machine learning (ML) techniques. Because manual inspection is limited, few newly incoming declarations can be labeled, so the ML approach must perform robustly when labeled data is scarce. However, current approaches to customs fraud detection are not designed for, or well suited to, this real-world setting. In this work, we propose $\textbf{GraphFC}$ ($\textbf{Graph}$ neural networks for $\textbf{C}$ustoms $\textbf{F}$raud), a model-agnostic, domain-specific, semi-supervised graph neural network based customs fraud detection algorithm that has strong semi-supervised and inductive capabilities. Extensive experiments on real customs data from the customs administrations of three different countries demonstrate that GraphFC consistently outperforms various baselines and the present state of the art by a large margin, with up to a 252% relative increase in recall over the present state of the art.