Numerous metascience studies and other initiatives have begun to monitor the prevalence of open science practices, yet it is more important to understand the 'downstream' effects, or impacts, of open science. PLOS and DataSeer have developed a new LLM-based indicator to measure one important effect of open science: the reuse of research data. Our results show a data reuse rate of 43%, which is higher than the rates obtained with established bibliometric techniques. We show that data reuse can be measured at scale using LLMs and generative artificial intelligence. The positive effects of research data sharing and reuse may currently be underestimated.