Assessing research novelty is a core yet highly subjective aspect of peer review, typically resting on implicit judgment and incomplete comparison to prior work. We introduce a literature-aware novelty assessment framework that explicitly learns human novelty judgments from peer-review reports and grounds these judgments in structured comparison to existing research. Using nearly 80K novelty-annotated reviews from top-tier AI conferences, we fine-tune a large language model to capture reviewer-aligned novelty evaluation behavior. For a given manuscript, the system extracts structured representations of its ideas, methods, and claims, retrieves semantically related papers, and constructs a similarity graph that enables fine-grained, concept-level comparison to prior work. Conditioned on this structured evidence, the model produces calibrated novelty scores and human-like explanatory assessments, reducing overestimation and improving consistency relative to existing approaches.
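The retrieval-and-comparison step above can be sketched in miniature. The sketch below is an illustrative assumption, not the paper's implementation: `Concept`, `build_similarity_graph`, `novelty_score`, the cosine measure, and the 0.5 edge threshold are all hypothetical stand-ins for the framework's structured representations, similarity graph, and calibrated scoring.

```python
# Hypothetical sketch of concept-level comparison against prior work.
# All names and the toy scoring rule are illustrative, not the paper's code.
from dataclasses import dataclass
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

@dataclass
class Concept:
    paper_id: str
    text: str
    vec: list  # embedding, e.g. from a sentence encoder (assumed precomputed)

def build_similarity_graph(manuscript, prior, threshold=0.5):
    """Edges link each manuscript concept to sufficiently similar prior-work concepts."""
    edges = []
    for m in manuscript:
        for p in prior:
            s = cosine(m.vec, p.vec)
            if s >= threshold:
                edges.append((m.text, p.text, round(s, 3)))
    return edges

def novelty_score(manuscript, prior):
    """Toy score: 1 minus the mean of each concept's best match in prior work,
    so heavily overlapped manuscripts score near 0 and unmatched ones near 1."""
    best = [max((cosine(m.vec, p.vec) for p in prior), default=0.0)
            for m in manuscript]
    return 1.0 - sum(best) / len(best)
```

In this toy version, a manuscript concept identical to a prior-work concept drives the score toward 0, while a concept with no close match in the retrieved papers drives it toward 1; the actual framework replaces this heuristic with an LLM conditioned on the graph evidence.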