We introduce and analyze a simple probabilistic model of article production and citation behavior that explicitly assumes that the citability of a given article does not decline over time. The model makes predictions about the number and age of items appearing in an article's reference list. These topics have been studied before, but only empirically, and to our knowledge no models have been presented. We then perform large-scale analyses of reference list length for a variety of academic disciplines. The results show that our simple model cannot be rejected, and indeed fits the aggregated data on reference lists rather well. Over the last few decades, the relationship between total publications and mean reference list length has been linear to a high level of accuracy. Although our model is clearly an oversimplification, it will likely prove useful for further modeling of the scholarly literature. Finally, we connect our work to the large literature on "aging" or "obsolescence" of scholarly publications, and argue that the importance of that area of research is no longer clear, while much of the existing literature is confused and confusing.