Transformer models have steadily expanded into every machine learning domain that can be converted to their underlying sequence-to-sequence representation, including tabular data. However, ubiquitous as it is, this representation restricts their extension to the more general case of relational databases. In this paper, we introduce a modular neural message-passing scheme that closely adheres to the formal relational model, enabling direct end-to-end learning of tabular Transformers from database storage systems. We address the challenges of learning data representation and loading, which are critical in the database setting, and compare our approach against representative models from related fields across a wide range of datasets. Our results demonstrate the superior performance of this newly proposed class of neural architectures.
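To make the core idea concrete, the following is a minimal sketch of a single message-passing step along a foreign-key link between two tables; the abstract does not specify the architecture, so all names here (`ForeignKeyMessagePassing`, `fk_index`, the sum aggregation, the linear message and update maps) are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

class ForeignKeyMessagePassing(nn.Module):
    """Hypothetical sketch: aggregate row embeddings of a referencing ('many')
    table into the rows of the referenced ('one') table along a foreign key."""

    def __init__(self, d_model: int):
        super().__init__()
        self.message = nn.Linear(d_model, d_model)      # per-row message transform
        self.update = nn.Linear(2 * d_model, d_model)   # combine target state with aggregate

    def forward(self, src_rows, dst_rows, fk_index):
        # src_rows: [num_src, d_model] embeddings of the referencing table's rows
        # dst_rows: [num_dst, d_model] embeddings of the referenced table's rows
        # fk_index: [num_src] long tensor; fk_index[i] is the referenced row of src row i
        msgs = self.message(src_rows)
        agg = torch.zeros_like(dst_rows)
        agg.index_add_(0, fk_index, msgs)               # sum messages per target row
        return self.update(torch.cat([dst_rows, agg], dim=-1))

# Toy usage: 5 "order" rows referencing 3 "customer" rows.
d_model = 64
orders = torch.randn(5, d_model)
customers = torch.randn(3, d_model)
fk = torch.tensor([0, 2, 1, 0, 2])                      # order i -> customer fk[i]
layer = ForeignKeyMessagePassing(d_model)
updated_customers = layer(orders, customers, fk)        # shape [3, d_model]
```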