Data Mesh: a Systematic Gray Literature Review

Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has piqued practitioners' interest, and there is considerable gray literature on it. At the same time, we observe a lack of academic attempts at defining and building upon the concept. Hence, in this article, we aim to start from the foundations and characterize the data mesh architecture in terms of its design principles, architectural components, capabilities, and organizational roles. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The review provides insights into practitioners' perspectives on the four key principles of data mesh: data as a product, domain ownership of data, self-serve data platform, and federated computational governance. Moreover, because data mesh is comparable to SOA (service-oriented architecture), we mapped the findings from the gray literature onto reference architectures from the SOA academic literature to create reference architectures describing three key dimensions of data mesh: organization of capabilities and roles, development, and runtime. Finally, we discuss open research issues in data mesh, partially based on the findings from the gray literature.


