Statistics is sometimes described as the science of reasoning under uncertainty. Statistical models provide one view of this uncertainty, but what is frequently neglected is the invisible portion of uncertainty: the portion assumed not to exist once a model has been fitted to some data. Systematic errors (i.e. bias) in data relative to some model and inferential goal can seriously undermine research conclusions, and qualitative and quantitative techniques have been created across several disciplines to quantify and, more generally, appraise such potential biases. Perhaps best known are the so-called risk of bias assessment instruments used to investigate the likely quality of randomised controlled trials in medical research. However, the logic of assessing the risks that various types of systematic error pose to statistical arguments applies far more widely. It applies even when statistical adjustment strategies for potential biases are used, as these frequently rely on assumptions (e.g. data missing at random) that can never be guaranteed in finite samples. Mounting concern about such situations can be seen in the increasing calls for greater consideration of biases caused by nonprobability sampling in descriptive inference (i.e. survey sampling) and of the statistical generalisability of in-sample causal effect estimates in causal inference, both of which concern how model-based and wider uncertainty should be considered when presenting research conclusions from models. Given that model-based adjustments are never perfect, we argue that qualitative risk of bias reporting frameworks for both descriptive and causal inferential arguments should be further developed and made mandatory by journals and funders. It is only through clear statements of the limits to statistical arguments that consumers of research can fully judge their value for any specific application.