We present PyTorch Frame, a PyTorch-based framework for deep learning over multi-modal tabular data. PyTorch Frame simplifies tabular deep learning in three ways: it provides a PyTorch-based data structure for handling complex tabular data, introduces a model abstraction that enables modular implementation of tabular models, and allows external foundation models to be incorporated for complex columns (e.g., LLMs for text columns). We demonstrate the usefulness of PyTorch Frame by implementing diverse tabular models in a modular way, applying these models to complex multi-modal tabular data, and integrating the framework with PyTorch Geometric, a PyTorch library for Graph Neural Networks (GNNs), to perform end-to-end learning over relational databases.
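To make the idea of per-column handling concrete, the following is a minimal, framework-free sketch of how columns of different types might be routed to different encoders, with an externally supplied model handling text columns. It is not PyTorch Frame's API: the type names, `TabularEncoder`, the helper functions, and `toy_text_embedder` (a stand-in for an external foundation model such as an LLM embedder) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical column-type labels; illustrative only, not PyTorch Frame's API.
NUMERICAL, CATEGORICAL, TEXT = "numerical", "categorical", "text"

Vector = List[float]


def encode_numerical(values: Sequence[float]) -> List[Vector]:
    """Standardize a numerical column to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a constant column
    return [[(v - mean) / std] for v in values]


def encode_categorical(values: Sequence[str]) -> List[Vector]:
    """One-hot encode a categorical column over its sorted vocabulary."""
    vocab = sorted(set(values))
    index = {category: i for i, category in enumerate(vocab)}
    return [[1.0 if index[v] == i else 0.0 for i in range(len(vocab))]
            for v in values]


def toy_text_embedder(text: str) -> Vector:
    """Stand-in for an external model: character and word counts."""
    return [float(len(text)), float(len(text.split()))]


@dataclass
class TabularEncoder:
    """Route each column to an encoder chosen by its declared type."""
    col_types: Dict[str, str]
    text_embedder: Callable[[str], Vector]

    def encode(self, table: Dict[str, Sequence]) -> List[Vector]:
        n_rows = len(next(iter(table.values())))
        rows: List[Vector] = [[] for _ in range(n_rows)]
        for name, values in table.items():
            kind = self.col_types[name]
            if kind == NUMERICAL:
                encoded = encode_numerical(values)
            elif kind == CATEGORICAL:
                encoded = encode_categorical(values)
            elif kind == TEXT:
                encoded = [self.text_embedder(v) for v in values]
            else:
                raise ValueError(f"unknown column type: {kind}")
            # Concatenate this column's features onto each row.
            for row, vec in zip(rows, encoded):
                row.extend(vec)
        return rows


table = {
    "price": [10.0, 20.0, 30.0],
    "color": ["red", "blue", "red"],
    "review": ["great value", "ok", "would buy again"],
}
encoder = TabularEncoder(
    col_types={"price": NUMERICAL, "color": CATEGORICAL, "review": TEXT},
    text_embedder=toy_text_embedder,
)
features = encoder.encode(table)
```

Because the text encoder is passed in as a plain callable, swapping the toy embedder for a real pretrained model would not require changing the routing logic, which is the kind of modularity the abstract describes.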