When different researchers study the same research question using the same dataset, they may obtain different and potentially even conflicting results. This is because there is often substantial flexibility in researchers' analytical choices, an issue also referred to as "researcher degrees of freedom". Combined with selective reporting of the smallest p-value or largest effect, researcher degrees of freedom may lead to an increased rate of false-positive and overoptimistic results. In this paper, we address this issue by formalizing the multiplicity of analysis strategies as a multiple testing problem. As the test statistics of different analysis strategies are usually highly dependent, a naive approach such as the Bonferroni correction is inappropriate because it leads to an unacceptable loss of power. Instead, we propose using the "minP" adjustment method, which takes potential test dependencies into account and approximates the underlying null distribution of the minimal p-value through a permutation-based procedure. This procedure is known to achieve more power than simpler approaches while ensuring weak control of the family-wise error rate. We illustrate our approach for addressing researcher degrees of freedom by applying it to a study on the impact of perioperative paO2 on post-operative complications after neurosurgery. A total of 48 analysis strategies are considered and adjusted using the minP procedure. This approach allows researchers to selectively report the result of the analysis strategy yielding the most convincing evidence while controlling the type I error, and thus the risk of publishing false-positive results that may not be replicable.
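The permutation-based minP adjustment described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation used in the study: the data are synthetic, the exposure is a hypothetical binary group indicator, and the three analysis strategies (`mean_diff`, `median_diff`, `log_mean_diff`) as well as the function name `minp_adjust` are invented for this example. The outcome is permuted under the complete null hypothesis of no association, which corresponds to weak control of the family-wise error rate.

```python
import numpy as np

# Three hypothetical analysis strategies for the same question:
# "does the outcome y differ between groups g == 0 and g == 1?"
def mean_diff(y, g):
    return y[g == 1].mean() - y[g == 0].mean()

def median_diff(y, g):
    return np.median(y[g == 1]) - np.median(y[g == 0])

def log_mean_diff(y, g):
    z = np.log1p(y)  # assumes a non-negative outcome
    return z[g == 1].mean() - z[g == 0].mean()

def minp_adjust(y, g, strategies, n_perm=2000, seed=0):
    """Single-step minP adjustment via permutation of the outcome.

    Returns the raw permutation p-value of each strategy and the
    adjusted p-value for the smallest of them.
    """
    rng = np.random.default_rng(seed)

    def abs_stats(outcome):
        return np.array([abs(s(outcome, g)) for s in strategies])

    # Row 0 holds the observed statistics, rows 1..n_perm the permuted ones.
    rows = [abs_stats(y)]
    rows += [abs_stats(rng.permutation(y)) for _ in range(n_perm)]
    stats = np.vstack(rows)
    n = stats.shape[0]

    # p-value of every row within each strategy: the fraction of rows
    # whose statistic is at least as extreme as that row's statistic.
    pvals = np.empty_like(stats)
    for j in range(stats.shape[1]):
        col = np.sort(stats[:, j])
        n_at_least = n - np.searchsorted(col, stats[:, j], side="left")
        pvals[:, j] = n_at_least / n

    # Null distribution of the minimal p-value across strategies.
    min_p = pvals.min(axis=1)
    raw = pvals[0]
    adjusted = float(np.mean(min_p <= min_p[0]))
    return raw, adjusted

# Synthetic data with no true group effect.
data_rng = np.random.default_rng(1)
y = data_rng.exponential(size=60)
g = np.repeat([0, 1], 30)

raw, adjusted = minp_adjust(y, g, [mean_diff, median_diff, log_mean_diff])
print("raw p-values:", raw)
print("minP-adjusted p-value of the smallest:", adjusted)
```

Because the same permutations are used for all strategies, the dependence between their test statistics is carried into the null distribution of the minimal p-value. By construction, the adjusted p-value in this sketch is never smaller than the smallest raw p-value and never larger than its Bonferroni-corrected counterpart.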