Over the past few years, a growing number of data platforms have emerged, including data commons, data repositories, and databases containing biomedical, environmental, social determinants of health, and other data relevant to improving health outcomes. As these platforms multiply, making them interoperate to form data meshes, data fabrics, and other types of data ecosystems reduces data silos, expands data use, and increases the potential for new discoveries. In this paper, we introduce ten principles, which we call pillars, for data meshes. The goals of the principles are 1) to make it easier, faster, and more uniform to set up a data mesh from multiple data platforms; and 2) to make it easier, faster, and more uniform for a data platform to join one or more data meshes. The hope is that the greater availability of data through data meshes will accelerate research and that the greater uniformity of meshes will lower the cost of developing meshes and connecting data platforms to them.