The rapid pace of scientific publishing has made identifying and synthesizing high-impact work an increasingly urgent challenge. We introduce MIRAI (Multi-year Inference of Research trends and Academic Impact), a deep learning framework that predicts a paper's impact using only its title, abstract, and publication date. We train MIRAI on the arXiv academic graph to predict 5-year PageRank and citation counts, achieving a Spearman's $\rho$ of 0.4686 for PageRank prediction and 0.6192 for citation prediction on papers published in 2021. We also propose a research ideation pipeline built on MIRAI that generates research ideas oriented toward high impact. An unbiased LLM judge rated these ideas as more impactful than those of a baseline without MIRAI at a 4:3 ratio. We make the 5-year citation prediction model publicly available at https://predict-paper-impact.vercel.app.
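For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how Spearman's $\rho$ between predicted impact scores and observed citation counts could be computed. It is not the authors' evaluation code: the function names `average_ranks` and `spearman_rho` and the toy data are assumptions made for illustration, and the numbers are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, actual):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(actual)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (illustrative only, not from the paper).
predicted_scores = [0.9, 0.1, 0.5, 0.7, 0.3]
true_citations = [120, 3, 40, 15, 8]
rho = spearman_rho(predicted_scores, true_citations)
```

Because the metric depends only on rank order, a model can score well under it without matching the absolute scale of citation counts, which is convenient when the target distribution is heavy-tailed.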