Towards a unified approach to formal risk of bias assessments for causal and descriptive inference

Statistics is sometimes described as the science of reasoning under uncertainty. Statistical models provide one view of this uncertainty, but what is frequently neglected is the 'invisible' portion of uncertainty: the part that is assumed not to exist once a model has been fitted to some data. Systematic errors, i.e. bias, in data relative to some model and inferential goal can seriously undermine research conclusions, and qualitative and quantitative techniques have been developed across several disciplines to quantify and appraise such potential biases. Perhaps best known are the so-called 'risk of bias' assessment instruments used to investigate the likely quality of randomised controlled trials in medical research. However, the logic of assessing the risks that various types of systematic error pose to statistical arguments applies far more widely. This logic applies even when statistical adjustment strategies for potential biases are used, as these frequently rely on assumptions (e.g. data 'missing at random') that can rarely be empirically guaranteed. Mounting concern about such situations can be seen in the increasing calls for greater consideration of biases caused by nonprobability sampling in descriptive inference (e.g. in survey sampling), and of the statistical generalisability of in-sample causal effect estimates in causal inference. Both concerns bear on how model-based and wider uncertainty should be considered when presenting research conclusions from models. Given that model-based adjustments are never perfect, we argue that qualitative risk of bias reporting frameworks for both descriptive and causal inferential arguments should be further developed and made mandatory by journals and funders. It is only through clear statements of the limits of statistical arguments that consumers of research can fully judge their value for any given application.
