Replicability of Simulation Studies for the Investigation of Statistical Methods: The RepliSims Project

K. Luijken, A. Lohmann, U. Alter, J. Claramunt Gonzalez, F. J. Clouth, J. L. Fossum, L. Hesen, A. H. J. Huizing, J. Ketelaar, A. K. Montoya, L. Nab, R. C. C. Nijman, B. B. L. Penning de Vries, T. D. Tibbe, Y. A. Wang, R. H. H. Groenwold

From arXiv, 36 pages, 0 figures

Results of simulation studies evaluating the performance of statistical methods are often considered actionable and can therefore have a major impact on the way empirical research is conducted. So far, however, there is limited evidence about the reproducibility and replicability of statistical simulation studies. Therefore, eight highly cited statistical simulation studies were selected, and their replicability was assessed by teams of replicators with formal training in quantitative methodology. The teams extracted the relevant information from the original publications and used it to write simulation code with the aim of replicating the results. The primary outcome was the feasibility of replication based on the information reported in the original publications. Replicability varied greatly: some original studies provided detailed information leading to almost perfect replication of results, whereas others did not provide enough information to implement any of the reported simulations. Replicators had to make choices regarding missing or ambiguous information in the original studies, error handling, and software environment. Factors facilitating replication included public availability of code and descriptions of the data-generating procedure and methods in graphs, formulas, structured text, and publicly accessible additional resources such as technical reports. Replicability of statistical simulation studies was mainly impeded by a lack of information and by limited sustainability of information sources. Reproducibility of simulation studies could be achieved by providing open code and data as a supplement to the publication. Additionally, simulation studies should be transparently reported, with all relevant information either in the research paper itself or in easily accessible supplementary material, to allow for replicability.
