Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains and has recently attracted growing interest. In this visionary paper, we explore representative architectures that address this requirement. We highlight three pivotal paradigms. The state-of-the-art DL-Centric architecture offloads DL computations to dedicated DL frameworks. The potential UDF-Centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the database system. The potential Relation-Centric architecture aims to represent a large-scale tensor computation through relational operators. While each of these architectures shows promise in specific use scenarios, we identify urgent requirements for seamlessly integrating them and for exploring the middle ground between them. We examine the gaps that impede this integration and explore strategies to close them. Finally, we present a pathway toward a novel database system that enables a broad class of data-intensive DL inference applications.
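To make the contrast between the UDF-Centric and Relation-Centric paradigms concrete, the following is a minimal sketch and not the system the paper proposes. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in database; the table names `x`, `w`, and `feats`, the UDF name `dense_col0`, and the toy matrices are all hypothetical. The relation-centric half expresses a matrix multiplication as a join on the shared dimension followed by a `SUM` aggregation; the UDF-centric half wraps one tensor computation in a function that the database calls once per row.

```python
import sqlite3

# Hypothetical toy data: a 2x3 feature matrix X and a 3x2 weight matrix W.
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
W = [[0.5, -1.0], [1.5, 0.0], [-0.5, 2.0]]


def to_tuples(matrix):
    """Flatten a dense matrix into (row, col, value) tuples."""
    return [(r, c, v) for r, row in enumerate(matrix) for c, v in enumerate(row)]


conn = sqlite3.connect(":memory:")

# --- Relation-centric sketch: tensors stored as relations ---
conn.execute("CREATE TABLE x (i INTEGER, k INTEGER, val REAL)")
conn.execute("CREATE TABLE w (k INTEGER, j INTEGER, val REAL)")
conn.executemany("INSERT INTO x VALUES (?, ?, ?)", to_tuples(X))
conn.executemany("INSERT INTO w VALUES (?, ?, ?)", to_tuples(W))

# Matrix multiplication as a join on the shared dimension k plus aggregation.
rows = conn.execute(
    "SELECT x.i, w.j, SUM(x.val * w.val) "
    "FROM x JOIN w ON x.k = w.k "
    "GROUP BY x.i, w.j ORDER BY x.i, w.j"
).fetchall()

relational = [[0.0] * len(W[0]) for _ in range(len(X))]
for i, j, v in rows:
    relational[i][j] = v

# --- UDF-centric sketch: one tensor computation wrapped in a UDF ---
conn.execute("CREATE TABLE feats (id INTEGER, f0 REAL, f1 REAL, f2 REAL)")
conn.executemany(
    "INSERT INTO feats VALUES (?, ?, ?, ?)",
    [(r, *row) for r, row in enumerate(X)],
)


def dense_col0(f0, f1, f2):
    """Dot product of one feature row with the first column of W."""
    return f0 * W[0][0] + f1 * W[1][0] + f2 * W[2][0]


conn.create_function("dense_col0", 3, dense_col0)
udf_scores = [
    score
    for (score,) in conn.execute(
        "SELECT dense_col0(f0, f1, f2) FROM feats ORDER BY id"
    )
]

print(relational)
print(udf_scores)
```

In this sketch the relational form leaves the whole computation visible to the query engine as ordinary joins and aggregates, whereas the UDF form hides it behind a function call; the DL-Centric paradigm would instead export the rows to an external DL framework, which is not shown here.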