Bayesian trial designs are still rarely used in practice. Even when such designs are accepted in principle, regulators have commonly required that they be calibrated against an upper bound on the frequentist type I error rate. The result is an internally inconsistent hybrid methodology in which important advantages of following Bayesian principles are lost. In particular, every preplanned interim look has an inflating multiplicity effect on the type I error rate. As an alternative, we consider the prototype case of a 2-arm superiority trial with dichotomous outcomes. The design is adaptive and uses error control based on sequentially updated posterior probabilities to conclude either efficacy of the experimental treatment or futility of the trial. As gatekeepers for a proposed design, regulators have the main responsibility for determining the parameters that control false positives, whereas trial sponsors and investigators have a natural role in specifying the criteria for stopping the trial due to futility. We suggest that the traditional frequentist operating characteristics of the design, the type I and type II error rates, be replaced, respectively, by the Bayesian criteria False Discovery Probability (FDP) and False Futility Probability (FFP), both terms corresponding directly to their probability interpretations. Importantly, sequential error control based on posterior probabilities during the data analysis satisfies these numerical criteria automatically, without the need for preliminary computations before the trial starts. The method also offers the option of applying a decision rule that terminates the trial early if the predicted costs of continuing would exceed the corresponding gains.
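The sequential posterior-probability monitoring described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration rather than the authors' specification: independent uniform Beta(1, 1) priors on the two success probabilities, the hypothetical threshold names `eps_fdp` and `eps_ffp`, and stopping rules that compare the posterior probability that the experimental arm is better with those thresholds (the actual design may, for example, involve a minimal clinically relevant difference). The posterior probability is computed with a known closed-form sum for Beta distributions that requires an integer first shape parameter, which holds under the assumed priors.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def prob_exp_better(a_e, b_e, a_c, b_c):
    """P(p_e > p_c) for independent p_e ~ Beta(a_e, b_e), p_c ~ Beta(a_c, b_c).

    Closed-form finite sum; a_e must be a positive integer.
    """
    total = 0.0
    for i in range(a_e):
        total += exp(
            log_beta(a_c + i, b_e + b_c)
            - log(b_e + i)
            - log_beta(1 + i, b_e)
            - log_beta(a_c, b_c)
        )
    return total


def monitor(observations, eps_fdp=0.05, eps_ffp=0.05):
    """Update the posterior after every outcome and check the stopping rules.

    observations: iterable of (arm, success), arm in {"E", "C"}, success in {0, 1}.
    Returns (decision, number_of_outcomes_seen).
    Assumed rules (illustrative, not taken from the source):
      efficacy if P(p_e <= p_c | data) <= eps_fdp
      futility if P(p_e >  p_c | data) <= eps_ffp
    """
    # Assumed uniform Beta(1, 1) priors: [successes + 1, failures + 1] per arm.
    params = {"E": [1, 1], "C": [1, 1]}
    n = 0
    for n, (arm, success) in enumerate(observations, start=1):
        params[arm][0 if success else 1] += 1
        post = prob_exp_better(*params["E"], *params["C"])
        if 1.0 - post <= eps_fdp:
            return "efficacy", n
        if post <= eps_ffp:
            return "futility", n
    return "continue", n
```

Calling `monitor` on a stream of `(arm, success)` pairs returns as soon as one of the two assumed posterior criteria is met, or `"continue"` if the data run out first. Because the check uses only the current posterior, adding further interim looks does not change the thresholds, which is the property the abstract emphasizes.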