Objective: The objective was to develop a cloud-based, federated system to serve as a single point of search, discovery, and analysis for data generated under the NIH Helping to End Addiction Long-term (HEAL) Initiative.

Materials and methods: The HEAL Data Platform is built on the open-source Gen3 platform and uses a small set of framework services and exposed APIs to interoperate with both NIH and non-NIH data repositories. The framework services cover authentication and authorization, creation of persistent identifiers for data objects, and the addition and updating of metadata.

Results: The HEAL Data Platform serves as a single point of discovery for more than one thousand studies funded under the HEAL Initiative. The platform has hundreds of users per month, provides rich metadata, and interoperates with data repositories and commons to provide access to shared datasets. It currently interoperates with nineteen data repositories. Secure, cloud-based compute environments that are integrated with STRIDES facilitate secondary analysis of HEAL data.

Discussion: Studies funded under the HEAL Initiative generate a wide variety of data types, which are deposited across multiple NIH and third-party data repositories. The mesh architecture of the HEAL Data Platform provides a single point of discovery for these data resources, which accelerates and facilitates secondary use.

Conclusion: The HEAL Data Platform enables search, discovery, and analysis of data that are deposited in connected data repositories and commons. By ensuring that these data are fully Findable, Accessible, Interoperable, and Reusable (FAIR), the HEAL Data Platform maximizes the value of data generated under the HEAL Initiative.