Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains and has recently attracted growing interest. In this vision paper, we explore representative architectures that address this requirement. We highlight three pivotal paradigms: (1) the state-of-the-art DL-centric architecture offloads DL computations to dedicated DL frameworks; (2) the potential UDF-centric architecture encapsulates one or more tensor computations into user-defined functions (UDFs) within the relational database management system (RDBMS); and (3) the potential relation-centric architecture aims to represent a large-scale tensor computation through relational operators. While each of these architectures shows promise in specific use scenarios, we identify an urgent need to integrate them seamlessly and to support the middle ground between them. We examine the gaps that impede this integration and explore strategies to close them. We present a pathway toward a novel RDBMS that enables a broad class of data-intensive DL inference applications.
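To make the contrast between the UDF-centric and relation-centric paradigms concrete, the following minimal sketch uses Python's built-in `sqlite3` module as a stand-in RDBMS. It is an illustration under stated assumptions, not the system proposed in the paper: the table names (`A`, `B`, `samples`), the function names (`relational_matmul`, `udf_scores`, `linear_score`), and the two-feature linear model are all hypothetical. The first function expresses a matrix multiplication purely with relational operators (a join on the shared dimension followed by a grouped sum); the second wraps a small tensor computation in a UDF that the database calls once per row.

```python
import sqlite3


def to_rows(matrix):
    """Flatten a dense matrix into (row_index, col_index, value) tuples."""
    return [(i, j, v) for i, row in enumerate(matrix) for j, v in enumerate(row)]


def relational_matmul(a, b):
    """Relation-centric sketch: C = A x B as a join plus a grouped aggregate."""
    conn = sqlite3.connect(":memory:")
    # Each matrix is stored as a relation of (index, index, value) tuples.
    conn.execute("CREATE TABLE A (i INTEGER, k INTEGER, val REAL)")
    conn.execute("CREATE TABLE B (k INTEGER, j INTEGER, val REAL)")
    conn.executemany("INSERT INTO A VALUES (?, ?, ?)", to_rows(a))
    conn.executemany("INSERT INTO B VALUES (?, ?, ?)", to_rows(b))
    # Join on the shared dimension k, then sum the products per output cell.
    rows = conn.execute(
        "SELECT A.i, B.j, SUM(A.val * B.val) "
        "FROM A JOIN B ON A.k = B.k "
        "GROUP BY A.i, B.j"
    ).fetchall()
    conn.close()
    out = [[0.0] * len(b[0]) for _ in range(len(a))]
    for i, j, v in rows:
        out[i][j] = v
    return out


def udf_scores(samples, weights):
    """UDF-centric sketch: a tiny linear model wrapped in a scalar SQL UDF."""
    conn = sqlite3.connect(":memory:")

    def linear_score(x1, x2):
        # Hypothetical model: a dot product over two feature columns.
        return weights[0] * x1 + weights[1] * x2

    # Register the Python function so SQL queries can call it per row.
    conn.create_function("linear_score", 2, linear_score)
    conn.execute("CREATE TABLE samples (id INTEGER, x1 REAL, x2 REAL)")
    conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", samples)
    rows = conn.execute(
        "SELECT id, linear_score(x1, x2) FROM samples ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(relational_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    print(udf_scores([(1, 1.0, 2.0), (2, 3.0, 4.0)], (0.5, 0.25)))
```

In this sketch the relation-centric form lets the query engine plan and execute the computation over tuples, whereas the UDF form keeps the tensor logic opaque to the engine. The DL-centric paradigm would instead export the rows to an external framework, which is not shown here.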