Scientific research trends and interests evolve over time. The ability to identify and forecast these trends is vital for educational institutions, practitioners, investors, and funding organizations. In this study, we predict future trends in scientific publications using heterogeneous sources, including historical publication time series from PubMed, research and review articles, pre-trained language models, and patents. We demonstrate that the popularity levels of scientific topics, and the changes in those levels (trends), can be predicted five years in advance across 40 years and 125 diverse topics, including life-science concepts, biomedicine, anatomy, and other science, technology, and engineering topics. Preceding publications and future patents are leading indicators for emerging scientific topics. We find that the ratio of reviews to original research articles is informative for identifying increasing or declining topics, with declining topics showing an excess of reviews. We also find that language models improve both insight into and prediction of temporal dynamics. In temporal validation, our models substantially outperform the historical baseline. Our findings suggest that similar dynamics apply across other scientific and engineering research topics. We present SciTrends, a user-friendly web tool for predicting scientific topic trends: https://hadasakaufman.shinyapps.io/SciTrend
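To make two of the quantities above concrete, the following is a minimal sketch, not the authors' implementation. It computes a per-year review-to-research ratio and scores a naive five-year baseline that simply carries the value observed five years earlier forward. The abstract does not define its "historical baseline", so this persistence baseline is an assumption, and all function names and the toy counts are hypothetical.

```python
from typing import Dict, List, Tuple

HORIZON = 5  # forecast horizon in years, as stated in the abstract


def review_ratio(reviews: Dict[int, int], research: Dict[int, int]) -> Dict[int, float]:
    """Per-year ratio of review articles to original research articles.

    Years with no research articles are skipped to avoid division by zero.
    """
    return {
        year: reviews.get(year, 0) / research[year]
        for year in sorted(research)
        if research[year] > 0
    }


def persistence_pairs(
    series: Dict[int, float], horizon: int = HORIZON
) -> List[Tuple[int, float, float]]:
    """Return (target_year, baseline_prediction, actual) triples.

    ASSUMPTION: the baseline predicts the value seen `horizon` years earlier.
    Only information available before the target year is used, which is the
    point of a temporal validation split.
    """
    return [
        (year, series[year - horizon], series[year])
        for year in sorted(series)
        if year - horizon in series
    ]


def mean_absolute_error(pairs: List[Tuple[int, float, float]]) -> float:
    """Average absolute gap between baseline prediction and actual value."""
    return sum(abs(pred - actual) for _, pred, actual in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical yearly counts for one topic (not real PubMed data).
    reviews = {2000: 10, 2001: 30}
    research = {2000: 100, 2001: 100}
    print(review_ratio(reviews, research))

    popularity = {2000: 1.0, 2005: 3.0, 2010: 2.0}
    print(mean_absolute_error(persistence_pairs(popularity)))
```

A learned model would be evaluated on the same (target year, actual) pairs, so that its error can be compared directly with the baseline's.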