The exponential growth in scientific publications poses a severe challenge for human researchers. It forces them to focus on ever narrower sub-fields, which makes it hard to discover impactful new research ideas and collaborations outside their own field. Existing methods can predict a scientific paper's future citation count, but they require the research to be finished and the paper written, so they usually assess impact long after the idea was conceived. Here we show how to predict the impact of the onset of ideas that researchers have never published. To this end, we developed a large evolving knowledge graph built from more than 21 million scientific papers. It combines a semantic network created from the content of the papers with an impact network created from their historical citations. Using machine learning, we can predict the future dynamics of the evolving network with high accuracy, and thereby the impact of new research directions. We envision that the ability to predict the impact of new ideas will be a crucial component of future artificial muses that can inspire new, impactful, and interesting scientific ideas.
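To make the idea of a combined semantic and impact network concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the toy records in `PAPERS`, the helper names `build_graph` and `unconnected_pair_features`, and the choice of features (node degrees and shared neighbors) are all assumptions made for illustration. Concepts that co-occur in a paper are linked, each link accumulates the citations of the papers that contain it, and concept pairs that are not yet linked are the candidates whose future impact a learned model would score.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: (publication year, concepts in the paper, citation count).
PAPERS = [
    (2018, {"graph", "citation"}, 10),
    (2019, {"graph", "embedding"}, 4),
    (2020, {"citation", "embedding", "graph"}, 7),
    (2020, {"muse", "embedding"}, 1),
]


def build_graph(papers, cutoff_year):
    """Build a concept graph from papers published up to and including cutoff_year.

    Returns (neighbors, impact):
      neighbors maps each concept to the set of concepts it co-occurs with,
      impact maps each sorted concept pair to the summed citations of the
      papers in which both concepts appear.
    """
    neighbors = defaultdict(set)
    impact = defaultdict(int)
    for year, concepts, citations in papers:
        if year > cutoff_year:
            continue  # keep only the graph state known at the cutoff
        for a, b in combinations(sorted(concepts), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
            impact[(a, b)] += citations
    return neighbors, impact


def unconnected_pair_features(neighbors):
    """Return simple features for every concept pair that is not yet linked.

    Each value is (degree of a, degree of b, number of shared neighbors).
    These features are an illustrative assumption, not the paper's feature set.
    """
    features = {}
    for a, b in combinations(sorted(neighbors), 2):
        if b in neighbors[a]:
            continue  # already linked, so not a new idea
        shared = neighbors[a] & neighbors[b]
        features[(a, b)] = (len(neighbors[a]), len(neighbors[b]), len(shared))
    return features


if __name__ == "__main__":
    neighbors, impact = build_graph(PAPERS, cutoff_year=2019)
    print(dict(impact))
    print(unconnected_pair_features(neighbors))
```

Calling `build_graph` with an earlier cutoff year gives a past snapshot of the graph. In this toy setup, comparing such a snapshot with a later one is how training labels for a predictive model could be derived, since a pair that is unlinked at the cutoff may gain a link and citations afterwards.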