While deep neural networks for images and text have been studied extensively, deep learning for relational databases (RDBs) remains largely unexplored. One direction that has recently gained traction is to apply Graph Neural Networks (GNNs) to RDBs. However, training GNNs on large relational databases (i.e., data stored in multiple database tables) is inefficient because it requires multiple rounds of training and can produce large, inefficient representations. Hence, in this paper we propose SPARE (Single-Pass Relational models), a new class of neural models that can be trained efficiently on RDBs while providing accuracy similar to that of GNNs. To enable efficient training, SPARE, unlike GNNs, exploits the fact that data in RDBs has a regular structure, which allows these models to be trained in a single pass while also exploiting symmetries. Our extensive empirical evaluation demonstrates that SPARE can significantly speed up both training and inference while offering competitive predictive performance compared with numerous baselines.