Value Added Tax (VAT) fraud erodes public revenue and puts legitimate businesses at a disadvantage, thereby worsening inequality. Identifying and combating VAT fraud before it occurs is therefore important for welfare. This paper proposes flexible machine learning algorithms that detect fraudulent transactions using the information contained in the large, complex network of VAT transactions. Detection combines a suitably constructed Laplacian matrix with classification algorithms that rely on scalable machine learning techniques. The method is applied to the universe of Bulgarian VAT data and detects around 50 percent of VAT fraud, outperforming well-known techniques that ignore the information provided by the network of VAT transactions. Importantly, the proposed methods are automated and can be run once taxpayers have submitted their VAT returns. This allows tax authorities to prevent large revenue losses by identifying fraud in business-to-business transactions within the VAT system at an early stage.
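To illustrate how a Laplacian matrix can carry network information into a fraud score, the following is a minimal sketch and not the paper's algorithm. It assumes a toy undirected, weighted graph of firms and uses Laplacian-regularised score propagation, a standard semi-supervised technique swapped in here for illustration. The function names `build_laplacian` and `propagate_scores`, the toy edges and labels, the penalty `lam`, and the 0.6 flagging threshold are all hypothetical choices.

```python
def build_laplacian(n, edges):
    """Unnormalised graph Laplacian L = D - A for n firms.

    edges: iterable of (i, j, weight) transaction links, treated as undirected.
    """
    A = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        A[i][j] += w
        A[j][i] += w
    L = [[-A[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        # Diagonal entry chosen so that every row of L sums to zero.
        L[i][i] = sum(A[i]) - A[i][i]
    return L


def propagate_scores(L, labels, lam=1.0, iterations=1000):
    """Minimise sum_{i labelled} (f_i - y_i)^2 + lam * f' L f by Gauss-Seidel.

    labels: dict mapping firm index -> 1.0 (known fraud) or 0.0 (known legitimate).
    Returns one smoothed fraud score per firm.
    """
    n = len(L)
    f = [labels.get(i, 0.5) for i in range(n)]  # unlabelled firms start at 0.5
    for _ in range(iterations):
        for i in range(n):
            c = 1.0 if i in labels else 0.0
            denom = c + lam * L[i][i]
            if denom == 0.0:
                continue  # isolated, unlabelled firm: the network says nothing
            # Off-diagonal entries of L are -A_ij, so negate to get sum_j A_ij f_j.
            neighbour_sum = -sum(L[i][j] * f[j] for j in range(n) if j != i)
            f[i] = (c * labels.get(i, 0.0) + lam * neighbour_sum) / denom
    return f


# Toy chain of five firms; firm 0 is known fraud, firm 4 is known legitimate.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
labels = {0: 1.0, 4: 0.0}

L = build_laplacian(5, edges)
scores = propagate_scores(L, labels)
flagged = [i for i, s in enumerate(scores) if s > 0.6]  # hypothetical threshold
```

In this toy chain the scores decline steadily from the known fraudulent firm to the known legitimate one, so firms trading closer to known fraud receive higher scores. In practice such network-derived scores would be one input among many to a classifier, and a graph at the scale of a national VAT register would call for sparse matrices rather than nested lists.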