Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains and has recently attracted growing interest. In this vision paper, we explore representative architectures that address this requirement. We highlight three pivotal paradigms: (1) the state-of-the-art DL-Centric architecture offloads DL computations to dedicated DL frameworks; (2) the potential UDF-Centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the database system; (3) the potential Relation-Centric architecture aims to represent a large-scale tensor computation through relational operators. While each of these architectures shows promise in specific use scenarios, we identify an urgent need to integrate them seamlessly and to support the middle ground between them. We examine the gaps that impede this integration and explore strategies to close them. We present a pathway toward a novel database system that enables a broad class of data-intensive DL inference applications.
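To make the Relation-Centric idea concrete, here is a minimal sketch of how one tensor computation, a matrix multiplication such as a dense layer without bias or activation, might be expressed purely through relational operators (a join followed by a group-by aggregation). It is an illustration under stated assumptions, not the system the paper proposes: the table names `A` and `B`, the (row, column, value) encoding, and the helper names `to_coo` and `relational_matmul` are all hypothetical, and SQLite stands in for the database system only because it ships with Python.

```python
import sqlite3


def to_coo(matrix):
    """Flatten a dense list-of-lists matrix into (row, col, value) tuples."""
    return [(i, j, v) for i, row in enumerate(matrix) for j, v in enumerate(row)]


def relational_matmul(a, b):
    """Compute a @ b using only a relational join and a SUM aggregation.

    Assumption: each matrix is stored as one relation with one tuple per
    element. Table and column names are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (i INTEGER, k INTEGER, val REAL)")
    conn.execute("CREATE TABLE B (k INTEGER, j INTEGER, val REAL)")
    conn.executemany("INSERT INTO A VALUES (?, ?, ?)", to_coo(a))
    conn.executemany("INSERT INTO B VALUES (?, ?, ?)", to_coo(b))

    # Join on the shared dimension k, then sum the products per output cell.
    rows = conn.execute(
        "SELECT A.i, B.j, SUM(A.val * B.val) "
        "FROM A JOIN B ON A.k = B.k "
        "GROUP BY A.i, B.j "
        "ORDER BY A.i, B.j"
    ).fetchall()
    conn.close()

    # Rebuild a dense result so it can be compared with the expected product.
    out = [[0.0] * len(b[0]) for _ in range(len(a))]
    for i, j, v in rows:
        out[i][j] = v
    return out


features = [[1.0, 2.0], [3.0, 4.0]]  # two input rows, two features each
weights = [[5.0, 6.0], [7.0, 8.0]]   # hypothetical 2x2 layer weights
result = relational_matmul(features, weights)
print(result)
```

In a DL-Centric deployment the same product would instead be computed by exporting the rows to a dedicated DL framework, and in a UDF-Centric deployment it would be wrapped in a single function invoked from the query; the sketch only shows the relational end of that spectrum.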