Research on the shutdown of research data repositories, and on its impact on the long-term availability of data, is currently limited. This paper takes an infrastructure perspective on the preservation of research data: it uses a registry to identify 191 research data repositories that have been closed and presents information on the shutdown process. The results show that 6.2 % of the research data repositories indexed in the registry were shut down. The risks that lead to repository shutdown are varied. The median age of a repository at shutdown is 12 years. Strategies to prevent data loss at the infrastructure level are pursued to varying extents: 44 % of the repositories in the sample migrated data to another repository, and 12 % maintain limited access to their data collection. However, neither strategy is a permanent solution. Finally, the paper discusses the general lack of information on repository shutdown events, as well as the effect of shutdowns on the findability of data and the permanence of the scholarly record.